Foreword


This is the fourth of a series of annual workshops sponsored by the Naval Surface Warfare
Center (NSWC). These Systems Reengineering Technology Workshops are motivated by the
fact that the Navy has invested billions of dollars in the development of systems that may be
modified and extended to respond to changing requirements. Systems reengineering technology
is necessary if the Navy (as well as the other users of large-scale, complex systems) is to benefit
from these extensive investments.

The workshop brings together representatives of government, industry, and academia to
address the issues confronting this technology. Although the principal interest of the sponsors
is the reengineering of embedded, real-time systems that include hardware, software, and
human-computer interaction, the workshop encourages the participation of all individuals and
organizations concerned with the reengineering of large-scale, complex systems.

These participants' proceedings contain the papers accepted for presentation at the
workshop. They have been organized according to the order of the program. The workshop
begins with a keynote address by Andrew P. Sage. His paper, "Systems Engineering and
Management for Reengineering," provides an introduction to the issues to be discussed during
the workshop. The paper is abstracted from Systems Management for Information Technology
and Software Engineering, which will be published by Wiley later in 1994. This opening paper
is complemented by Robert S. Arnold's "A Road Map Guide to Software Reengineering
Technology," which is included as an appendix. It is the introduction to R. Arnold (ed.),
Software Reengineering, IEEE Computer Society Press, 1993. Thus, this volume begins and ends
with two very important surveys that present a conceptual context for reengineering and this
workshop.

The  papers  in  the  volume  have  been  grouped  into  units  with  common  themes.  The 
commonality,  of  course,  often  is  more  apparent  than  real.  With  the  decision  not  to  have  parallel 
sessions  and  a  commitment  to  accept  all  the  relevant,  quality  papers  that  could  be  accommodated 
in  the  time  available,  the  workshop  organizers  were  not  always  able  to  focus  a  session  on  a 
central  theme.  We  overcame  this  difficulty,  in  part,  by  structuring  some  of  the  workshop  sessions 
as  panels  rather  than  paper  presentations.  In  those  cases,  the  papers  in  these  proceedings 
constitute  an  extended  discussion  of  concepts  that  may  have  been  presented  only  briefly  in  the 
workshop. 

By  the  standards  of  most  academic  conferences,  these  proceedings  are  long.  Because  the 
literature  on  systems  (as  opposed  to  software  or  organizational)  reengineering  is  sparse,  we 
elected  not  to  place  any  page  limits  on  the  papers.  Thus,  the  papers  are  as  long  (or  short)  as 
they  needed  to  be  to  convey  what  the  authors  believe  is  important.  As  the  organizers  of  this 
workshop,  we  hope  that  those  who  read  these  proceedings  will  concur  with  this  decision. 

Mark Wilson, Workshop Chair
Bruce Blum, Program Co-Chair
Gilbert Myers, Program Co-Chair

Abstract: This paper presents an overview and perspective on
systems engineering and systems management for
reengineering and related approaches towards organizational
and technology revitalization. As we will see, there are at least
three types of reengineering that can be considered:
reengineering at the levels of product, process, and systems
management. We claim that all three are generally needed and
an approach at one level only may not be satisfactory.

I. INTRODUCTION

Responsiveness is very clearly a critical need today. By this, we
mean, of course, organizational responsiveness in providing
products and services of demonstrable value to customers, and
thereby to the organization's own stakeholders. This must be
accomplished by efficiently and effectively employing leadership
and empowered people, such that systems management strategies,
organizational processes, human resources, and appropriate
technologies are brought to bear on the production of high quality
and trustworthy goods and services. We also mean responsiveness
in supplying appropriate technologies and systems to meet these
objectives. Figure 1 illustrates these ingredients and some of their
linkages in the production of products and services. It is a
composite representation that indicates some of the many
ingredients responsible for trustworthy and high quality products.

Many recent papers [1] have indicated the need for continual
revitalization in the way in which we do things, such that they are
always done better. This is the case even if the external
environment were static and unchanging. However, when we are
in a period of high velocity environments, then continual
organizational change and associated change in processes and
product must be considered as a fundamental rule of the game for
progress.

Fig. 1 Some of the Many Major Ingredients in Reengineering

In many ways, past progress can act to impede future progress.
This is especially the case when we become very accustomed to a
particular way of doing things, and have allowed a very large
overhead situation to accumulate around what were once highly
successful efforts at the production of quality products and services.
It is especially difficult to change when what we are doing now is
done very well. Yet, it is entirely possible that a competitor may be
able to do it better, in any of a number of ways. Also, what we do
well now may well not be what we will need to be doing in the
future.

In many if not most cases, improvement needs come about not
because of human inattention to the tasks they perform. Rather, and
more often than not, it suggests that the tasks themselves are in
need of restudy and renovation. These tasks may be strategic in
nature, or they may be tactical, or they may be purely operational.
Most often, however, attempts at improvement through attention at
only operational levels will yield very modest improvements for the
effort invested. As many have indicated, it is the (strategic and
tactical, and not the operational) system that is at fault.

Figure 2 indicates some improvement approaches, each of
which relates to reengineering. They are interrelated and our listing
is not complete. One of our objectives in this paper is to provide a
perspective and overview of some of the many reengineering
approaches and methodologies that have been suggested. A much
more complete discussion is contained in [2], and this paper is a
summary of Chapter 8, which discusses reengineering.

Fig. 2 Some Approaches to Reengineering

Figure 3 represents a generic view of reengineering. The entity
to be reengineered can be either systems management, process, or
product, or some appropriate combination of these. We will expand
on this illustration and its interpretation in our discussions to
follow.


Fig. 3 A Three Phase Approach to Generic Reengineering

We can approach a discussion of reengineering from several
perspectives. First, we can discuss the

■ structural,

■ functional, and

■ purposeful

aspects of reengineering. Alternately, or in addition, we can
examine reengineering at the level of

■ systems management,

■ process, or

■ product.

We may examine reengineering issues at any, or all, of the three
fundamental systems engineering lifecycles:

■ research, development, test and evaluation (RDT&E),

■ systems acquisition, procurement, or production, or

■ systems planning and marketing.

Within each of these lifecycles, we could consider reengineering at
any or all of the three generic phases of definition, development, or
deployment. At the level of systems management, we reengineer
each of these phases and potentially all other processes within the
company for integrated improvement. At the level of process
reengineering only, as we define it, only a single process is
redesigned, and with no fundamental changes in the structure or
purpose of the organization as a whole. Changes, when they occur,
may be radical and revolutionary, or incremental and evolutionary,
at the levels of systems management, processes, or products.

One fundamental notion of reengineering is, however, the
reality that it must be top down directed if it is to achieve the
significant and long-lasting effects that are possible. Thus, there
should be a strong purposeful and systems management orientation
to reengineering, even though it may well have major implications
for such lower level concerns as structural facets of a particular
product.
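
The levels, lifecycles, and phases listed above can be read as three dimensions along which a reengineering effort is located. The following Python sketch enumerates the combinations. It is illustrative only: the class and function names are assumptions introduced here, and treating the three dimensions as fully independent is a simplification that the paper does not state explicitly.

```python
from enum import Enum
from itertools import product


class Level(Enum):
    """Level at which reengineering is examined."""
    SYSTEMS_MANAGEMENT = "systems management"
    PROCESS = "process"
    PRODUCT = "product"


class Lifecycle(Enum):
    """The three fundamental systems engineering lifecycles."""
    RDTE = "research, development, test and evaluation"
    ACQUISITION = "systems acquisition, procurement, or production"
    PLANNING = "systems planning and marketing"


class Phase(Enum):
    """The three generic phases within each lifecycle."""
    DEFINITION = "definition"
    DEVELOPMENT = "development"
    DEPLOYMENT = "deployment"


def reengineering_targets():
    """Return every (level, lifecycle, phase) combination as a list of tuples."""
    return list(product(Level, Lifecycle, Phase))


if __name__ == "__main__":
    for level, lifecycle, phase in reengineering_targets():
        print(f"{level.value} | {lifecycle.value} | {phase.value}")
```

A given effort would normally address only a few of these combinations; the enumeration simply makes the scope of the classification visible.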

Our paper is organized as follows. We first provide some
definitions of reengineering. Then, we discuss some five of the
many perspectives that have been taken relative to reengineering.
Then, we provide some summary comments.

II. PERSPECTIVES ON REENGINEERING

In this section, we provide definitions and perspectives on what
we consider to be three related but different types of systems
reengineering: reengineering at the levels of

■ product,

■ process or product line, and

■ systems management.

There have been a number of definitions, formal and informal, of
reengineering. The word is occasionally spelled as re-engineering.
We choose the former spelling here; both are correct.

A. Product Reengineering

The term reengineering could mean some sort of reworking or
retrofit of an already engineered product. This could well be
interpreted as maintenance or refurbishment. As we have noted
previously [3], maintenance can be viewed from reactive or
corrective, interactive or adaptive, and proactive or perfective
perspectives. Or, reengineering could be interpreted as reverse
engineering, in which the characteristics of an already engineered
product are identified, such that the product can perhaps be
modified or reused. Inherent in these notions are two major facets
of reengineering:

1. It improves the product or system delivered to the user for
enhanced reliability, maintainability, or for an evolving user
need.

2. It increases understanding of the system or product itself.

Thus, this interpretation of reengineering is almost totally product
focused. We will call it product reengineering.

Thus, we might say that

Product reengineering is the examination, study, capture,
and modification of the internal mechanisms or
functionality of an existing system or product in order to
reconstitute it in a new form and with new features, often to
take advantage of newly emerged technologies, but without
major change to the inherent functionality and purpose of
the system.

This definition indicates that product reengineering is basically
structural reengineering with, at most, minor changes in purpose
and functionality of the product that is reengineered. This
reengineered product could be integrated with other products
having rather different functionality than was the case in the initial
deployment. Thus, reengineered products could be used together
with this augmentation to provide new functionality and serve new
purposes. A number of synonyms for product reengineering easily
come to mind. Among these are: renewal, refurbishing, rework,
repair, maintenance, modernization, reuse, redevelopment, and
retrofit.

A specific example of a product reengineering effort might be
that of taking a legacy system written in Cobol or Fortran, reverse
engineering it to determine the system definition, and then
reengineering it in C++ or Ada. Depending upon whether or not
any modified user requirements are to be incorporated into the
reengineered product, we would either reengineer the product,
through a forward engineering effort, just after reverse engineering
had determined the initial development (technical) system
specifications, or after reverse engineering far enough to determine
user requirements and user specifications, and then updating these.
This reverse engineering concept [4], in which salient aspects of
user requirements or technological specifications are recovered
from examination of characteristics of the product, predates product
reengineering.

Figure 4 illustrates product reengineering conceptually. A
software standards reference [5] states that "reengineering is
a complete process that encompasses an analysis of existing
applications, restructuring, reverse, and forward engineering."
The IEEE standard for software maintenance [6] suggests that
reengineering is a subset of software engineering that is comprised
of reverse engineering and forward engineering. We have no
disagreement with this sort of definition at all; we prefer to call it
product reengineering for the reasons just stated. It is also
necessary to consider reengineering at the levels of processes and
systems management if we are to take full advantage of the major
opportunities offered. Thus, the qualifier "product" appears
appropriate and desirable in the context used here.

Fig. 4 Representation of Product Reengineering

B. Process Reengineering

Reengineering can also be considered at the levels of processes
and systems management. At the level of processes only, the effort
would be almost totally internal. It would consist of modifications
to whatever standard lifecycle processes are in use in a given
organization in order to better accommodate new and emerging
technologies or new customer requirements for a system. For
example, an explicit risk management capability might be
incorporated at several different phases of a given lifecycle and
accommodated by a revised configuration management process.
This could be implemented into the processes for RDT&E,
acquisition, and systems planning and marketing. Basically,
reengineering at the level of processes would consist of the
determination, or synthesis, of an efficacious process for ultimately
fielding a product on the basis of a knowledge of generic customer
requirements, and the objectives and critical capabilities of the
systems engineering organization. Our Figure 5 illustrates,
conceptually, some of the facets of process reengineering.

Fig. 5 Conceptual Illustration of Process Reengineering

In accordance with this discussion, we offer the following
definition.

Process reengineering is the examination, study, capture,
and modification of the internal mechanisms or
functionality of an existing process, or systems engineering
lifecycle, in order to reconstitute it in a new form and with
new features, often to take advantage of newly emerged
capabilities, but without changing the inherent functionality
and purpose of the process itself that is being reengineered.

We could reengineer either the process for RDT&E, system
acquisition or production, or systems planning and marketing.
Among the first discussions of this sort of effort at business process
reengineering, although the word redesign was used rather than
reengineering, is a contemporary paper by Davenport and Short
[7]. This was greatly expanded upon in a recent and seminal text
by Davenport [8], which does make use of the term reengineering.
We will provide an overview of this major work in a later section.

These and other authors recognize, of course, that redesign of
processes only, without attention to reengineering at a higher
level than processes, may in many instances represent an
incomplete and not fully satisfactory way to improve organizational
capabilities. Thus, the processes considered as candidates for
reengineering are high level managerial as well as operational
processes. Information technology is considered to be a major
enabling catalyst for process reengineering. Continuous process
improvement and institutionalization is advocated.

Reengineering at the process level, and the resulting
improvement in product that results from improvement in the
product line, is essentially what AT&T Bell Laboratories has
used to identify a common tailorable systems acquisition process
for Federal Systems Advanced Technology (FSAT) use as a
deployment methodology for all Process Management Teams
(PMTs) at this organization [9]. The indicated benefits to using a
common and understood lifecycle include:

■ shorter development cycles,

■ fewer engineering change orders,

■ products that fulfill customer expectations, and

■ reduced program and product development costs throughout
the lifecycle.

Thus, the process improvement results ultimately in an increase in
effectiveness of product for the same cost, or a reduction in cost for
the same effectiveness, or some blend of these two.

Essentially what we call process reengineering here is termed
"domain engineering" by the Software Productivity Consortium
[10]. The combination of domain engineering with the
"application engineering," or lifecycle, effort needed to produce the
actual product is termed "synthesis." It is intended for use in the
systems acquisition, procurement, or production lifecycle. Figure 6
illustrates the synthesis concept, which is intended, in part, to
facilitate the incorporation of reusable software products in new
software systems. The synthesis process proceeds as follows.


Fig. 6 The Software Productivity Consortium Synthesis Process
(Synthesis: Domain Engineering followed by Application Engineering)


1. A set of 6 domain engineering phases is used to develop the
product line. In the first phase, process, or product line,
planning is accomplished. This is presumably based on
knowledge of the organization and its critical core
capabilities, and customer needs, as illustrated in Figure 5.

2. This is followed by a process definitional phase, called
product line analysis by SPC, in which the requirements for
the process or product line are specified.

3. This leads to product-line, or lifecycle process, development.

4. A modeling and simulation environment is next constructed
such that it will be possible to accomplish prototyping in the
actual lifecycle for production of the product.

5. A product library is next constructed. This is comprised of
reusable software modules and code generators which can
produce executable code from a set of input technical
specifications. A code generator can be viewed as a special
type of reusable software product.

6. A process, or product line, test and evaluation facility is
constructed next. This has the capacity for test and evaluation
of products from the product line. Following this phase, the
applications engineering lifecycle begins. This involves
actual definition, development, and deployment on the basis
of the just engineered product line, or process, lifecycle.

7. Product definition is the first phase in actual production of the
software product. This is achieved by identifying the user
requirements for a software product and translating them into
a set of specifications.

8. Product prototyping occurs in the second phase of the
applications engineering lifecycle. Here, a prototype is built
and, with user interaction presumably, used to refine the
technical specifications for the software product. The
modeling and simulation environment, built earlier in
domain engineering, is used for this purpose.

9. New product development occurs next. In this particular
instance, this refers to the production of custom built
executable code for those portions of the software product
that are not to be comprised of reusable code.

10. Reuse development occurs next. In this phase, the
reengineered code that has become reusable code and code
produced by applications generators in phase 5 of domain
engineering is integrated in with the customized code to
result in a complete functioning product.

11. In the final phase, integration and test of this functional
software occurs. This generally involves use of the test suites
produced by the product line test and evaluation facility.

12. The deployed product results from this effort.

As indicated in Figure 6, there are several entry points from
domain engineering to application engineering as several of the
results of intermediate phases of domain engineering (at phases 4,
5, and 6) are used in the actual lifecycle that is developed as a result
of domain engineering.
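
The twelve steps above can be sketched as two ordered phase lists plus a mapping that records which domain engineering outputs are reused later. This is a minimal illustration, not SPC's own notation: the short phase names are paraphrases of the numbered items, and every identifier below is a hypothetical name introduced for this sketch.

```python
# Short phase names are paraphrases of the numbered list; they are
# assumptions for illustration, not terminology defined by SPC.
DOMAIN_ENGINEERING = [
    "product line planning",
    "product line analysis",
    "product line development",
    "modeling and simulation environment",
    "product library",
    "test and evaluation facility",
]

APPLICATION_ENGINEERING = [
    "product definition",
    "product prototyping",
    "new product development",
    "reuse development",
    "integration and test",
    "deployed product",
]

# Domain engineering outputs that the text says are used by later
# application engineering phases (items 8, 10, and 11 of the list).
REUSED_BY = {
    "product prototyping": "modeling and simulation environment",
    "reuse development": "product library",
    "integration and test": "test and evaluation facility",
}


def synthesis_steps():
    """Return the twelve steps as (number, stage, phase) tuples, in order."""
    steps = []
    number = 1
    stages = (
        ("domain engineering", DOMAIN_ENGINEERING),
        ("application engineering", APPLICATION_ENGINEERING),
    )
    for stage, phases in stages:
        for phase in phases:
            steps.append((number, stage, phase))
            number += 1
    return steps


if __name__ == "__main__":
    for number, stage, phase in synthesis_steps():
        source = REUSED_BY.get(phase)
        note = f" (uses: {source})" if source else ""
        print(f"{number:2d}. [{stage}] {phase}{note}")
```

Keeping the reuse links in a separate mapping, rather than embedding them in the phase lists, mirrors the way the text describes them: as entry points from one stage into the other.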

C. Reengineering at the Level of Systems Management

At the level of systems management, reengineering is directed at
potential change in all business or organizational processes,
including the systems acquisition process lifecycle itself. Many
authors have discussed reengineering the corporation. Arguably,
the earliest use of the term business reengineering was by Hammer
[11], in 1990, and more fully documented in a more recent work on
Reengineering the Corporation [12]. There are a small plethora of
related works, as we will soon discuss.

Hammer's definition of reengineering, "Reengineering is the
fundamental rethinking and radical redesign of business processes
to achieve dramatic improvements in critical, contemporary
measures of performance, such as cost, quality, service and speed,"
is a definition of what we will call reengineering at the level of
systems management. There are four major terms in this definition.

■ Fundamental refers to a large scale and broad scope
examination of virtually everything about an organization and
how it operates. The purpose is to identify potential weaknesses
that are in need of diagnosis and correction.

■ Radical redesign suggests disregarding existing
organizational processes and structures, and inventing totally
new ways of accomplishing work.

■ Dramatic improvements suggests that, in Hammer's view,
reengineering is not about making marginal and incremental
improvements in the status quo. It is about making "quantum"
leaps in organizational performance.

■ Processes represent the collection of activities that are used to
take  input  materials,  including  intellectual  inputs,  and 
transform  them  into  outputs  and  services  that  have  value  to 
the  customer. 

Hammer suggests that reengineering and revolution are almost
synonymous terms. He identifies three types of firms that
attempt reengineering: those in trouble, those who see trouble
coming, and those who are ambitious and seek to avoid impending
troubles.

He indicates that one major catalyst for reengineering is the
creative use of information technology. Reengineering is not just
automation, however; it is the ambitious and rule breaking study of
everything about the organization to enable more effective and
efficient organizational processes to be designed.

We essentially share this view of reengineering at the level of
systems management. Our definition is similar.

Systems management reengineering is the examination,
study, capture, and modification of the internal mechanisms
or functionality of existing system management processes
and practices in an organization in order to reconstitute
them in a new form and with new features, often to take
advantage of newly emerged organizational competitiveness
requirements, but without changing the inherent
functionality and purpose of the organization itself.

We make no representation that this definition, or the other two for
that matter, of reengineering is at all the same across the many
works that we discuss. Figure 7 represents this conception of
reengineering at the level of systems management.

Fig. 7 Reengineering at the Level of Systems Management


Lifecycle process reengineering occurs as a natural byproduct of
reengineering at the level of systems management. This may or
may not result in the reengineering of already existing products.
Generally, it will, as new products and competitive strategies are a
major underlying objective of reengineering at the level of systems
management, or organizational reengineering as it is more
commonly called.

D.  Perspectives  of  Reengineering 

This very brief discussion of reengineering suggests that we can
consider reengineering at three levels: systems management,
lifecycle processes, and product. The major purpose of
reengineering, regardless of whether it is at the product level or the
process level or the level of systems management, is to enable us to
produce a better product for the same cost, or a lower cost product at
an effectiveness comparable to that of the initial product. Thus it improves
competitiveness of the organization in coping with changing
external situations and environments. We may approach
reengineering, at any or all of these levels, from any of three
perspectives:

■ reactive, because we realize that we are in trouble and perhaps
in a crisis situation, and reengineering is one way to bring
about needed change;

■ interactive, because we wish to stay abreast of current changes
as they evolve; or

■ proactive, because we wish to position our organization now
for changes that we believe will occur in the future, and to
emerge in the changed situation as a market leader.

In our next section, we examine some of the many contemporary
ideas that have been expressed about the subject of reengineering.

III. AN OVERVIEW OF REENGINEERING APPROACHES

As we have noted, it is possible to consider reengineering at the
levels of strategy and systems management, process or product line,
or product. In this section, we provide an expanded overview of
each by means of an overview of contemporary literature on
reengineering.

Without question, more has been written about reengineering at
the levels of strategy or systems management than at the other
levels. This is not unreasonable since reengineering efforts at the
level of organizational strategy have direct implications for
management at the levels of management control, and thence at
the levels of process to implement management controls, and
product.

It is difficult to trace the origins of reengineering in any unique
manner. The notions of reengineering are very interrelated with
those of systems engineering, information technology, strategic
planning, and many other subjects. The term systems
management reengineering is not at all common. Others have
used such related terms as systems reengineering, organizational
reengineering, corporate reengineering, and business process
improvement. These terms can, however, be interpreted to mean
reengineering at the level of system or product, reengineering at
the level of lifecycle process, and of the structure and functionality
of the organization.

A.  Business  Process  Improvement 

In 1992, Harrington published a seminal work on business
process improvement [13]. The major thesis of the work is that it is
business and manufacturing processes that are the key to error-free
performance. His view, that the process is the problem and not the
employees, is essentially that of Deming and others in the TQM
areas. Harrington defines a process as a group of activities that
takes inputs, adds value to them, and produces an output that is in
support of an organization's objectives. Two generic types of
processes are identified. Production processes are directly
concerned with yielding the output product or service. Business
processes support production processes. These include design of
production processes, payroll processes, and engineering change
processes.

Many works on reengineering recognize a dichotomy between
organizational processes and organizational functionality. Most
organizations are structured into vertically functioning groups, or
hierarchies, and most processes are organized into horizontal
phases for work flow. The many waterfall lifecycle models we have
illustrated in our effort surely demonstrate this. Three desirable
attributes of Business Process Improvement (BPI) are:

■ efficiency, in terms of minimizing the cost of the resources
used;

■ effectiveness, in terms of producing desired results; and

■ adaptability, in terms of flexibility in accommodating
changing customer and organizational needs.

To do this requires processes with the following well defined
characteristics: ownership and accountability, boundaries and
scope, interfaces and responsibilities, work tasks and training
requirements, measurement and feedback controls, customer
related measurements and targets, cycle times, and formalized
change procedures. Figure 8 illustrates the BPI process.


[Figure not reproduced; its first block reads "Initiate Business Process Improvement"]

Fig. 8 Interpretation of Business Process Reengineering


Harrington suggests five phases for business process
improvement (BPI). These, represented in Figure 8, are as follows.

1. Organize for Improvement - This phase involves several
steps. First, it is necessary to establish an Executive
Improvement Team (EIT). Then a BPI champion is
appointed and executive training is provided. A normative
improvement model is identified and BPI objectives are
communicated to employees. Next, it is necessary to review
business strategy and anticipated customer requirements.
This enables the organization to identify and select critical
processes for improvement and to appoint process owners.
Performance Improvement Team members are selected.

2. Develop an Understanding of the Various Processes
Currently in Use - This phase includes defining the scope
and objectives of current processes and the boundaries within
which they are functional. This includes a variety of analysis
functions, including structuring current processes and
detection and diagnosis of areas for potential improvement.

3. Streamline Organizational Processes for Enhanced
Efficiency, Effectiveness, and Adaptability - Current
processes are corrected through various streamlining
approaches that are responsive to the diagnosis performed in
the last phase. This includes eliminating bureaucracy,
reducing the opportunity for errors, reducing
non-value-added activities, and otherwise simplifying
processes such as to reduce cost and increase effectiveness.

4. Implement a Program of Systematic Measurement and
Controls - In this phase, a program of systematic
measurements is used as a quality control and monitoring
system in order both to maintain the new process productivity
and to keep from regressing into poorer process
implementations only because they have been used in the
past. An extensive program of measurements, feedback and
action is suggested. This includes audits of what are denoted
as Poor Quality Cost (PQC) facets.

5. Continue the Evolutionary Improvement - Periodic reviews
of the effort are used to enable detection, diagnosis and
correction of difficulties as improvement continues. A formal
program of business process qualification is suggested
through use of a six level BPI scale for process maturity
status. The levels are unknown, understood, effective,
efficient, error-free, and world class.
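
As an illustration only, the six level maturity scale named in phase 5 can be sketched as an ordered enumeration. The numeric ranks (1 for the least mature level, 6 for the most mature), the helper names next_level and least_mature, and the process names in the usage note are assumptions made for this sketch; they are not taken from Harrington's text.

```python
from enum import IntEnum


class ProcessMaturity(IntEnum):
    """Six maturity levels in the order listed in the text.

    The numeric values are an assumption of this sketch: a higher
    number simply means a more mature process.
    """
    UNKNOWN = 1
    UNDERSTOOD = 2
    EFFECTIVE = 3
    EFFICIENT = 4
    ERROR_FREE = 5
    WORLD_CLASS = 6


def next_level(level):
    """Return the next maturity level, or None at the top of the scale."""
    if level is ProcessMaturity.WORLD_CLASS:
        return None
    return ProcessMaturity(level + 1)


def least_mature(process_levels):
    """Return the names of the processes at the lowest recorded level.

    process_levels maps a (hypothetical) process name to its level.
    """
    lowest = min(process_levels.values())
    return sorted(name for name, lvl in process_levels.items() if lvl == lowest)
```

A periodic review could, for example, call least_mature({"payroll": ProcessMaturity.EFFECTIVE, "engineering change": ProcessMaturity.UNKNOWN}) to pick out the processes most in need of attention.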

Harrington's effort has much in common with those in strategic
quality management [1] [14]. The work has much relevance to
benchmarking, a topic we will soon discuss, and to systematic
measurements. The author has had much experience in efforts of
this sort, and this is much in evidence in his high caliber writing.

B.  Intelligent  Enterprise 

The efforts of James Brian Quinn are focused on knowledge
based services as a necessary complement to manufacturing efforts.
His view of an organization is basically that of a collection of
service activities and that both service and product oriented
organizations will obtain their major competitive advantages not
from superior physical facilities and materials alone, but from
knowledge and service based capabilities. In three recent works
[15] [16] [17] concerning technologies in services, the claim is set
forth that it is services and not manufacturing activities that provide
the major source of value to consumers. This does not suggest that
services replace manufacturing, but rather that if it were not for the
value added by the services that are associated with a manufactured
product, there would be far diminished value to the product itself.
Several characteristics of new organizations are cited:

■ "infinitely flat," or horizontal;

■ "spider's web" like, or non-hierarchical and with highly
networked interconnections;

■ "hollow corporations," in which outsourcing of both products
and services becomes an increasing reality;

■ demolished bureaucracies and vertical integration; and

■ "intellectual holding companies," in which intellectual
technologies are critical.

It is represented that this will lead to precise and swift strategy
execution, the leveraging and retention of key people, and "creative
management for profits."

A 1992 text by Quinn [18] represents a definitive integration
and synthesis of earlier efforts on this subject. It is much concerned
with concentrating organizational strategy on core intellectual
competencies and core service competencies. He suggests four key
rules to follow in order to generate success in this regard.

1. Focus internal organizational resources on those relatively
few basic sources of intellectual strength and service strength
that will create and sustain a real and meaningful
distinctiveness to the customer over the long term.

2. Approach the remaining capabilities as a non critical set of
services activities which may be supplied internally or
outsourced from external suppliers who compete well in
functional activities related to these capabilities.

3. Sustain success by building entry barriers around those
selected critical core capabilities to prevent a competitor from
assuming a substantial market position.

4. Plan and control outsourcing such as never to become either
dependent upon, or dominated by, external suppliers.

Strategic sourcing, including outsourcing, is a major ingredient in
these maxims; we discuss it later. Appropriate interfaces between
production efforts and service efforts are also stressed, as is the
management of knowledge based intellect and professional
intellect.

Seven types of innovative organizations are identified in this
work.

1. Basic research organizations support large RDT&E units.
They select products for development on the basis of careful
and conservative tradeoffs among risk and potential profit.

2. Large system producers develop large-scale systems that
generally cost a great deal and which must perform in a
reliable manner for a very long time.

3. Dominant market share oriented companies are often not
the first to introduce an emerging technology into the
marketplace. They often support large research units and
plan market entry and product evolution to obtain maximum
penetration, decreased risk, great product reliability, and
lower overall costs.

4. State of the art technology development companies are
preeminent in their knowledge of a specific technology.
Often they operate in markets in which technical
performance criteria are the major drivers of demand and are
therefore able to self-define the characteristics of next
generation technology.

5. Discrete freestanding product line companies form strong
research divisions and act as entrepreneurial units. They
depend more on technology push for their initiatives than
demand pull, and will often introduce new products on a
small scale and obtain real time market test rather than pay
for very expensive marketing studies that may not be as
effective as small scale product introduction and interactive
modification of the product to meet consumer needs.

6. Limited volume or fashion companies provide

and limited quantity products to a specialized market niche.

7. Job shop or custom design companies provide one of a kind
products that meet an individual customer's requirements.
These may often involve flexible manufacturing or mass
customization approaches to enable highly specialized design
to meet an individual customer's requirements.

Clearly, this is not a mutually exclusive listing. Nor is the listing
collectively exhaustive. Each of the industry types may be
associated with an organizational strategy and configuration that is
tuned to providing maximum success opportunities. Each type will
have a different propensity for organizational growth, maturity,
decline, and rebirth as we discussed in our last chapter. In his effort,
Quinn discusses the typical product lifecycle for each,
organizational characteristics, the mix of attributes that describe
typical innovations, and recommended organizational structures for
RDT&E, acquisition or production, planning and marketing, and
interfaces between marketing and customers. Approaches for
managing the intelligent enterprise are also suggested. These
involve efforts that also encompass core capabilities, outsourcing,
total quality management, and benchmarking as well as some of
the other ingredients illustrated in Figure 8.2.

C.  Process  Innovation 

In the aforenoted text by Davenport [8], a careful distinction is
made between process improvement and process innovation. His
fundamental distinctions are that improvement is continuous and
incremental in nature, deals with the existing process, can be
accomplished in a relatively short time, is a bottom-up activity, and
is a narrow scope effort with relatively moderate attendant risks.
On the other hand, innovation is generally a discrete phenomenon
that is revolutionary and radical in nature. It starts with a zero base
as contrasted with the existing process, can only be accomplished
over a long time, is a top-down activity, is a broad scope effort that
cuts across all of the functional areas and processes in the
organization, and is usually characterized by high associated risks.

Of course, there may exist questions of whether an innovation is
a major improvement and whether a set of incremental
improvements does not add up to an innovation. Rather than a
binary scale to separate these two, perhaps it would be best to
consider a continuous scale. This would enable us to consider
incremental improvement at one end and radical innovation at the
other. Obviously, either can be appropriate or inappropriate in
specific situations. Change increments may be so small that
centuries would be required to accomplish any change with
appreciable value added. On the other hand, radical innovation may
be so dramatic that the organization is culturally and otherwise
unable to adapt. Culture shock and customer revolt are often the
result.

Information technology is suggested as a major enabler of
process innovation, together with the organizational and human
enabler, and an enabler based on measurements associated with
process information and management of the information
environment. Essentially these three were suggested as enablers in
our Figure 1. The process innovation process itself is comprised of
5 phases, and a number of activity steps within each phase, as
connoted by Figure 9. We note that there are opportunities both for
traditional incremental improvement and the more extreme
innovative improvement. Presumably, the overall process is
iterative and this leads to continual innovation, as suggested by the
feedback from phase V to phase I in the illustration of Figure 9.

Fig. 9 The Process of Process Innovation

Not explicitly shown in this figure are the organizational
communications that occur at each phase of the effort and the
commitment building that is also needed. Davenport examines a
number of enablers of specific processes including: planning,
research, development, design, production, marketing, and sales.

Also considered briefly is the role of organizational culture in
shaping desirable innovation strategies. In a related work [19], this
is considered in greater detail. Five models of information culture,
denoted as information politics in the paper, are defined.

1. Technocratic Utopianism is a formal analytical approach to
information management. It stresses highly quantitative
approaches, major reliance on emerging technologies, and
full information assets. There is an underlying assumption
that technology will resolve all problems. Organizational and
political issues are generally ignored, perhaps because they
are considered unmanageable. In reality, the major focus may
be on the technologies used to process data rather than on the
information content in the data. The major objective is
modeling and use of organizational data. Data engineering,
and some aspects of information engineering, is considered
king. All that is really needed for organizational paradise is
the selection of the best hardware and software, and use of
the most appropriate CASE tools to assist in constructing the
most appropriate entity relationship diagrams and data flow
models.

2. Feudalism was the model of information politics most often
encountered in this study. In this model, a number of people
in feudal departments individually control information
acquisition, representation, storage, analysis, transmission,
and use. The language used for information representation is
often different across the various departments. Critical
information that affects the organization as a whole is often
not collected. When a subset of this is collected by an
individual department, it may not be passed on outside the
boundaries of the department. Making organizationally
informed decisions for the common good is often very
difficult, as a result of the information poor environment.
Sometimes, however, strategic alliances across departments
are possible and innovation within these units may be
possible.

3. Monarchy is a pragmatic solution to the difficulties inherent
in the feudal model. Information management is centralized
and there is little autonomy concerning information policies.
A benign monarch who is enlightened concerning
information technology and organizational needs for
information may set up, potentially through a Chief
Information Officer (CIO), a very effective and efficient
system. A constitutional monarchy may well accomplish
this; a despotic monarchy seldom will perform satisfactorily.
An Executive Information System (EIS) or Executive
Support System (ESS) [20] may well be a beneficial product
of a constitutional monarchy. One problem with a
constitutional information monarchy, however, is the
problem of succession when the monarch dies, retires, or is
overthrown. This can lead to information anarchy.

4. Anarchy is the result of complete absence of any information
policy. Usually this is not a willfully imposed model but a
result of a breakdown of one of the centralized approaches,
such as a monarchy. This results in individuals and units in
an organization managing their own information resources
and developing information reports that serve their own
needs rather than the organization as a whole. The major
shortcoming of an information anarchy is not the wasted
effort involved in the redundant information processing and
storage effort across units, but the tower of Babel effect that
results in terms of providing information that is useful for the
entire organization. There will seldom exist any
interoperability across units, at the level of either data or
information. As a consequence of this, the various units may
well present entirely different results from using the same
data on the generally very different alternative model
management systems, including intuitive model
management systems, that will be in use.

5. Federalism involves the use of negotiations [21] [22] to bring
competing and noncooperative individuals and units into
consensus on information related issues. Strong central
leadership and an organizational culture that encourages
learning, cooperation and consensus are required in order for
this model to work. Understanding by top organizational
management of both information technologies and the value
of information is a need if a shared information vision is to be
created.

It seems quite clear that these models are neither mutually
exclusive nor collectively exhaustive. A given organization may
well function with a hybrid model and the model that is in use at a
given time may evolve from one form to the other with changes in
organizational dynamics. Some of the models of organizational
cultures discussed in [1] and [2] seem also very plausible as
information cultures.

In a study of the information culture at 25 companies,
Davenport and his colleagues found that the feudal model was the
most common model in 12 companies. Federalist, monarchist, and
technocratic utopianist models were found with approximately
equal frequency in that 8, 7, and 9 companies were found
predominantly following the prescriptions of these models. Only 4
organizations followed the anarchist model. They also ranked these
five alternative models of information cultures according to their
scoring on four attributes of information value:

■  commonality  of  vocabulary, 

■ access to information,

■  information  quality,  and 

■  information  management  efficiency. 

Each of the four attributes had equal weight in this study. Their
conclusion, that the federalist and monarchist models are most
appropriate, in most situations, seems inescapable. The feudalist
and anarchist models, which actually seem to have a lot in
common, were the poorest performers.
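
The equal weighting described above amounts to ranking each model by the simple average of its four attribute scores. A minimal sketch of that arithmetic follows. The function name rank_models and the score scale are our assumptions for illustration, and no scores from the study itself are reproduced here.

```python
# The four attributes of information value named in the text.
ATTRIBUTES = (
    "commonality of vocabulary",
    "access to information",
    "information quality",
    "information management efficiency",
)


def rank_models(scores):
    """Rank models by the equally weighted mean of their attribute scores.

    scores maps a model name to a dict of attribute -> numeric score.
    Returns a list of (model, mean_score) pairs, best first.
    The numeric scale is whatever the caller supplies (an assumption
    of this sketch; the text does not state the scale used).
    """
    means = {}
    for model, by_attribute in scores.items():
        missing = [a for a in ATTRIBUTES if a not in by_attribute]
        if missing:
            raise ValueError(f"{model} is missing scores for: {missing}")
        # Equal weight: each attribute contributes 1/4 of the total.
        means[model] = sum(by_attribute[a] for a in ATTRIBUTES) / len(ATTRIBUTES)
    return sorted(means.items(), key=lambda pair: pair[1], reverse=True)
```

Given placeholder scores for two hypothetical models, say every attribute scored 4 for "model A" and 2 for "model B", rank_models places model A first. Unequal weights could be introduced by replacing the plain average with a weighted sum.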

In order to determine a "best" model for a given organization,
we need to know the current model that is in use, and the
evolutionary or revolutionary model for the information culture to
which the organization should be moving. Four suggestions are
given; our interpretation of these is as follows.

1. Match information management strategies to the
organizational culture extant and the organizational culture
the organization is desirous of adopting.

2. Practice technological realism, both in terms of information
technologies themselves, interoperability across potentially
different platforms, and the value of information that is not
always easily captured in electronic form.

3. Select appropriate information managers, both from the
standpoint of technical skills and broad-scope organizational
understanding skills.


4. Avoid building information empires, such as through
creation of despotic monarchies and information czars.

Explicit recognition of information management cultures and
managing them constructively is suggested as the bottom line. In
conformance to our discussions in [1] and [2], we would prefer to
call these information management styles, and information cultures
or information management cultures, rather than the "information
politics" term used by these authors. As we have noted before, there
are semantic differences in use by the many workers in these areas.
In terms of the vocabulary we use here, the organizational political
cultures identified by the authors are the anarchist and the feudalist
models. These are not to be encouraged. On this, of course, we
agree strongly with the authors of this excellent work.

D.  Benchmarking 

A benchmark is much like a breakpoint [23]. Actually, the
concept of a benchmark predates that of a breakpoint. Much further
discussion of benchmarking is available in [24] [25] [26]. A recent
and definitive work by Watson [27] provides an excellent overview
of many current efforts concerning benchmarking. He identifies
four types of benchmarking efforts.

1. Internal benchmarking involves observations taken entirely
within the same organization, generally of the best practices
that have resulted in one segment of the organization that are
desired to be transferred into another segment of the
organization. Internal benchmarking efforts at
Hewlett-Packard are described.

2. Competitive benchmarking involves targeted best practices
in an external organization. These can be at the level of
systems management, processes, or product. Often, much
secondary research is used for competitive benchmarking.
The competitive benchmarking studies of the Ford Motor
Company, and their use of these in developing Ford Taurus,
are described.

3. Functional benchmarking is concerned with performance
investigation within a specific functional area for an
industry-wide function. The results of functional
benchmarking are especially suited to identifying process
improvement related benchmarks. The efforts of the General
Motors Corporation in September 1994, which led to major
GM strides to enhance product quality and reliability, are
described.

4. Generic benchmarking is concerned with studying the best
processes in actual use in practice in a given organization
such as to enable analogous development of enhanced
processes by the organization(s) sponsoring the
benchmarking study. A generic benchmarking study by
Xerox, which ultimately led to understanding and analogous
modification and adoption of facets of the shipping processes
activities of the catalog casual clothing sales organization L.
L. Bean, is described. This effort was so successful that it
led to establishment of the Benchmarking Effectiveness
Strategy Team (BEST) network to transfer lessons learned
throughout the Xerox Corporation.

In our terminology, we would use the term functional product
benchmarking to refer to benchmarking of the directly measurable
aspects of a product. These serve to describe the operational
capabilities that are supplied to the user of the product or service.
The nonfunctional attributes of a product are those constraints,
alterables, or limitations that relate to the structural or purposeful
properties of the product that are not a part of the functional
properties. The nonfunctional attributes of a product relate to such
important characteristics of a product as reliability, maintainability,
quality, and availability. Thus what was accomplished here would
not be described, using our terminology, as functional product
benchmarking. It is really nonfunctional product benchmarking.
The study conducted by GM was not restricted to automotive
products. It was a study of quality processes at 11 cooperating
organizations in a variety of product areas. They had established 10
hypotheses for empirical evaluation and these relate very closely to
quality processes. It would, therefore, be very appropriate to denote
this type of benchmarking as functional process benchmarking.

Like a breakpoint, a benchmark involves a baseline comparison
of existing practices of an organization with those best practices as
used by one organization, or perhaps many other organizations.
These practices may be at the level of product, process, or systems
management. Breakpoints and benchmarks have the same ultimate
purpose, that of enabling and enhancing organizational
improvements for customer satisfaction. The use of a benchmark
can be reactive, in which case we desire to emulate the best
practices of others that have already been established, or perhaps
excel in these. It can be interactive or anticipative, in which case,
we desire to foresee efforts that are now occurring and adjust
organizational practices, more or less in real time, to keep abreast
of the competition. We can use benchmarks for forecasting
purposes such that we evolve entirely new forms of organizational
results in the form of system management strategies, processes, and
products that are each world class in quality and trustworthiness. In
all cases, benchmarks are based on systematic measurements, both
internal and external measurements, and serve as a catalyst for
action and results. In this regard, benchmarks are also very much
related to Critical Success Factors (CSF) [28] [29].

We see that a benchmark is a standard of excellence or
achievement, or critical success factor that provides a baseline
against which to measure or evaluate, or otherwise judge, similar
entities. While the benchmarking notion may seem quite simple at
first, there are really a number of benchmarking features that need
to be examined. There are at least seven important and interrelated
questions that need to be asked before we benchmark.

1.  do  we  benchmark? 

2.  do  we  benclunark? 

3.  Which  benchmark  measures  do  we  need? 

4. Who do we benchmark?

5. Where do we benchmark?

6.  When  do  we  benchmark? 

7.  do  we  benchmark? 

We can and should ask these questions of our organization; we can
and should ask these questions of those organizations that represent
competitors. We can and should ask these questions in the search
for structural, functional, and purposeful answers. A similar set
needs to be asked after we benchmark. Benchmarking efforts can
be conducted relative to issues at the level of systems management,
process, or product. Answers to these questions need to be obtained
in terms of implications for each of these levels.

There can be some obvious ethical, moral, and legal concerns
relative to benchmarking. Some aspects of benchmarking may
seem to amount to spying or espionage. Clearly, successful
benchmarking should not and cannot involve behavior that is either
immoral, unethical, or illegal. One of the major activities of the
International Benchmarking Clearinghouse Committee of the
American Productivity and Quality Center (APQC) has resulted
in a set of guidelines, known as a Code of Conduct, for
benchmarking that have been subscribed to by a considerable
number of companies and which are generally felt to be above
reproach [30]. There are nine principles in the Benchmarking
Code of Conduct. They may be summarized as follows.

1. Legality - The acquisition of trade secrets is proscribed, as is
doing benchmarking without first requesting approval by the
benchmarkee. All actions and intents should be legal ones.

2. Exchange - The same type and level of information should
be provided both by the benchmarked organization and the
organization doing the benchmarking in an openly
communicated exchange.

3. Confidentiality - Benchmarking communications should be
considered as confidential to the organizations involved and
obtained information concerning benchmarking should not
be divulged outside the concerned organizations without
prior consent of all parties.

4. Use - Benchmarking information should only be used for
formulation of improvement options for processes or
products within organizations participating in the study.
Attribution of benchmarking partner names requires prior
permission, and benchmarking information must not be used
for marketing or sales purposes.

5. First-Party Contact - Benchmarking information contacts
with a partner organization should be made through the
point of contact established by the benchmarking partner.
Mutual agreement must be reached for delegation of
responsibility in this regard to other parties.

6. Third-Party Contact - Prior permission must be obtained
before divulging the name of the benchmarking point of
contact to a party or in an open forum.

7. Preparation - A benchmarking contact with the point of
contact at another organization should only be made after
proper planning and preparation of an interview guide, in
order to make the encounter efficient and effective.

8.  Completion  -  Each  bcncluiuuking  study  must  be  completed 
to  Uie  satisfaction  of  all  bcnclureuking  partners  in  a  timely 
maiuicr,  as  agreed  to  prior  to  Uie  study. 

9.  UmierstamUng  ami  Action  -  All  partners  should  be  treated 
wiUi  miiunl  respect  tuid  luiderstanding.  and  infomiation 
should  be  used  as  niutiuJly  agreed 

It seems quite clear that these conduct codes are also applicable as
ethical codes of conduct, as well as being very useful standards for
benchmarking practice. We can obtain benchmarks at the levels of
systems management, process, and product.

E.  Integrated  Product  Development 

In many ways, integrated product development is an extension
of concurrent engineering as discussed in [31] [32] [33]. It is also
closely related to the other reengineering approaches we describe
here. The following definition seems appropriate.

Integrated product development is a systems management
philosophy and approach that uses functional teams to
produce an efficient and effective process for the ultimate
deployment of a product that satisfies customer needs
through concurrent application and integration of all
necessary lifecycle processes.

Integrated Product Development, or IPD, involves systems
management, leadership, systems engineering processes, the
products of the process, concurrent engineering and integration of
all necessary functions and processes throughout the organization
to result in a cost-effective product that provides total quality and
customer need satisfaction, generally in a just-in-time fashion.

Thus, IPD is an organization's product-development strategy. It
is focused on results. It also addresses the organizational need for
continual enhancement of efficiency and effectiveness in all of its
processes that lead to a product, or service. There are many focal
points for IPD. Twelve are particularly important.

1. A customer satisfaction focus is needed as a part of
competitive strategy and is the result of a successful
competitive strategy.

2. A results focus and a product focus is needed in order to
bring about total customer satisfaction.

3. A process focus is needed as high quality competitive
products that satisfy customers and result in organizational
success come from efficient and effective processes. This
necessarily requires process understanding.

4. A strategic planning and marketing focus is needed to
insure that product and process lifecycles are fully integrated
throughout all organizational functions, external suppliers,
and customers.

5. A concurrent engineering focus is needed to insure that all
functions and structures associated with fulfilling customer
requirements are applied throughout the lifecycle of the
product to insure correct people, correct place, correct
product, and correct time deployment.

6. An integration engineering focus is needed to insure that
relevant processes and the resulting processes fit together in
a seamless manner.

7. A teamwork and communications focus is needed to insure
that all of the functional, or multifunctional, teams function
synergistically for the good of the customer and organization.

8. A people empowerment focus is needed such that all
decisions are made by qualified people at the lowest possible
level that is consistent with authority and responsibility.
Empowerment is a responsibility and not just an entitlement
and entails commitment and appropriate resource allocation
to support this commitment.


9. A systems management reengineering focus is needed, both
at the levels of radical and revolutionary change, as well as
for evolutionary change, such as to also result in radical,
revolutionary, or evolutionary changes in processes and
product.

10. An organizational culture and leadership focus is needed in
order to successfully accommodate changed perspectives
relative to customers, total quality, results and products,
processes, employees, and organizational structures.

11. A methods, tools, and technologies focus is needed as they are
needed throughout all aspects of the IPD process even
though they alone will not bring about success.

12. A systematic measurements focus, primarily on proactive
measurements but also on interactive and reactive
measurements, is needed as we need to know where to go
and where we are now, in order to make progress towards
getting there.

All of this should bring about high quality, continual and
evolutionary, and perhaps even radical and revolutionary,
improvement for customer satisfaction. Each of these could be
expanded into a series of questions or a checklist, and used to
evaluate the potential effectiveness of a proposed integrated product
development process and team. While our discussion of IPD may
make it seem as an approach uniquely suitable for system
acquisition, production, or procurement, it is equally applicable to
the products of the RDT&E and marketing lifecycle.

We see that IPD is a people, organizations and technology
focused effort as linked together through a number of lifecycle
processes by systems management. These are major ingredients
for all our efforts here, as we suggested in Figure 8.1. The major
result of IPD is the ability to make optimum decisions within
available resources and to execute them efficiently and effectively in
order to achieve three causally linked objectives:

■ to integrate people, organizations and technology into a set of
multifunctional and networked product development teams

■ to increase the quality and timeliness of decisions through
centrally controlled, decentralized and networked operations
and thereby

■ to completely satisfy customers through quality products and
services that fulfill their expectations and meet their needs.

The bottom line is clearly customer satisfaction through quality,
short product delivery time, reduced cost and improved
performance and functionality. Equally supported by IPD are
organizational objectives for enhanced profit, well-being of
management, and a decisive and clear focus on risk and risk
amelioration.

Figure 10 illustrates a suggested sequence of steps and phases
to establish an integrated product development process. As with
other efforts this embodies the definition, development, and
deployment triage we have used so often in this book. As the
implementation of the detailed steps in these phases is relatively
standard, we will not describe them further here.

Appropriate references to IPD include [34] and [35]. At this
point, IPD is a relatively new concept and there are few available
references on the subject.

Fig. 10 Phases in Integrated Product Development Process

F. Product Reengineering

Reengineering at the level of product has received much
attention in recent times, especially in information technology and
software engineering areas. This is not a subject that is truly
independent of reengineering at the levels of either systems
management or of a single lifecycle process.

As we noted earlier, much of product reengineering is very
closely associated with reverse engineering to recover either design
specifications or user requirements. This is then followed by
refinement of these requirements and/or specifications and the
forward engineering to result in an improved product. The term
reverse engineering, rather than reengineering, was used in one of
the early seminal papers in this area [36]. In this work, as well as
in a more recent chapter on the subject [37], the following efforts
represent both the taxonomy of and phases for what we denote here
as product reengineering.

1. Forward engineering is the original process of defining,
developing, and deploying a product, or realizing a
system concept as a product.

2. Reverse engineering, sometimes called inverse engineering,
is the process through which a given system or product is
examined in order to identify or specify the definition of the
product either at the level of technological design
specifications or system or user level requirements.

2.1 Redocumentation is a subset of reverse engineering in
which a representation of the subject system or product is
recreated for the purpose of generating functional
explanations of original system behavior and, perhaps
more importantly, to aid the reverse engineering team in
better understanding the system both at a functional and
structural level. There are a number of redocumentation tools
for software available and some of these are cited in [37].
One of the major purposes of redocumentation is producing
new documentation for an existing product where the
existing documentation is faulty, and perhaps virtually
absent.

2.2 Design Recovery is a subset of reverse engineering in
which the redocumentation knowledge is combined with
other efforts, often involving the personal experiences and
knowledge of others about the system, that lead to
functional abstractions and enhanced product or system
understanding at the level of function, structure, and even
purpose. We would prefer to call this deployment recovery,
development (which would include design) recovery, and
definition recovery, depending upon the phase in the
reverse engineering lifecycle at which the recovery
knowledge is obtained.

3. Restructuring involves transformation of the reverse
engineering information concerning the original system
structure into another representation form. This generally
preserves the initial functionality of the original system, or
modifies it slightly in a purposeful manner that is in accord
with the user requirements for the reengineered system and
the way in which they differ from the requirements for the
initial system. For our purposes, the terms deployment
restructuring, development restructuring, and definition
restructuring seem to be appropriate disaggregations of the
restructuring notion.

4. Reengineering is, as defined in these efforts, equivalent to
redevelopment engineering, renovation engineering, and
reclamation engineering. Thus, it is more related to
maintenance and reuse than the other forms of systems
management and process reengineering that we have
discussed in this chapter. Reengineering is the recreation of
essentially the original system in a new form that has
improved structure but generally not much altered purpose
and function. The nonfunctional aspects of the new system
may be considerably different from those of the original
system, especially with respect to quality and reliability.

Figure 4, which illustrates product reengineering, involves
essentially these six activities.

We can recast this by considering a single phase for definition,
for development, and for deployment that is exercised three times.
We then see that there is a need for recovery, redocumentation, and
restructuring as a result of the reverse engineering product obtained
at each of the three basic phases. This leads us to suggest Figure 11
as an alternative way to represent Figure 4 and an interpretation of
the representations used in [36] and [37]. Their discussions
utilized a three phase generic lifecycle of requirements, design and
implementation. In this representation, implementation contains
some of the detailed design and production efforts of our
development phase and potentially less of the maintenance efforts
that follow initial fielding of the system. The restructuring effort,
based on recovery and redocumentation knowledge obtained in
reverse engineering, is used to effect deployment restructuring,
development restructuring, and definition restructuring. To these
restructured products, which might well be considered as reusable
products, we augment the knowledge and results obtained by
detailed consideration of potentially augmented requirements.
These augmented requirements are translated, together with the
results of the restructuring efforts, into the outputs of the
reengineering effort at the various phases to ultimately result in the
reengineered product.


For the most part, this is also the perspective taken on
reengineering in a recent definitive reprint book on software
reengineering [38], especially in the lead article by the editor of this
work [39] that takes an inherently transformational view of product
reengineering. This reprint book is much concerned with the
major ingredients needed for software product reengineering as
summarized in Figure 12.

Fig. 12 Major Product Reengineering Issues


Arnold indicates that there are three basic classes of
transformational views.

1. Nonprocedural views are meta-level views, such as decision
tables, event trees, attribute trees, data schemas, user
requirements, and system specifications. These do not
represent views of the actual entity but a view of a view of the
entity or, in other words, salient characteristics of the entity.
A nonprocedural view is a purposeful view.

2. Procedural views contain direct information about
procedures or representations, or information intimately
associated with this information. Source code, and the objects
and entities of object oriented languages are procedural
views. A procedural view is a functional view.

3. Pseudoprocedural views, or architecturally oriented views,
contain perspectives both of procedural and nonprocedural
views. Hierarchy charts, structural models, data flow
diagrams, entity-relationship diagrams, and Petri nets are
examples of pseudoprocedural views. A pseudoprocedural
view could also be called a structural view.

We can also have views that are derived from analysis of one, or in
some other way derived from one, of the three basic view
categories. Arnold denotes these as analysis views. For the most
part, purposeful views or nonprocedural views are associated with
the definitional phases of the lifecycle for product acquisition. They
concern user requirements and technological specifications.
Functional or procedural views tend to be associated with the very
end of the development phase of the lifecycle and the deployment
phase when systems may be thought of in terms of their input
output characteristics. Pseudoprocedural, or architectural or
structural, views tend to be associated with the earlier phases of
system development. One of the major purposes of both forward
and reverse engineering, and tools that support these, is to enable
transformation from one view to another such as to ultimately
obtain a functionally useful product.
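
As one concrete illustration of moving between views, the sketch
below derives a small structural (pseudoprocedural) view, a call
hierarchy, from a procedural view, a fragment of source code. It is
only a minimal sketch under stated assumptions: Python is chosen for
convenience, the sample program in SOURCE and the helper name
call_graph are hypothetical and are not taken from Arnold's work or
from any particular reverse engineering tool, and only direct calls
to plainly named functions are recorded.

```python
import ast
from typing import Dict, List

# Hypothetical procedural view: a tiny program given as source text.
SOURCE = '''
def read_input(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    words = parse(read_input(path))
    print(len(words))
'''


def call_graph(source: str) -> Dict[str, List[str]]:
    """Map each top-level function to the sorted names it calls directly."""
    tree = ast.parse(source)
    graph: Dict[str, List[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            # Keep only calls of the form name(...); method calls such as
            # text.split() are skipped to keep the sketch short.
            callees = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
            graph[node.name] = sorted(callees)
    return graph


if __name__ == "__main__":
    for caller, callees in call_graph(SOURCE).items():
        print(caller, "->", callees)
```

The resulting mapping is the kind of hierarchy chart that the text
classes as a structural view; a redocumentation effort could start
from such a mapping before any purpose-level (nonprocedural) view is
recovered.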

Arnold [39] indicates several potential uses and characteristics
of product reengineering. These, which are neither mutually
exclusive nor collectively exhaustive, include the following.

1. Reengineering may help reduce an organization's risk of
product evolution through what effectively amounts to reuse
of proven subproducts.

2. Reengineering may help an organization recoup its product
development expenses through constructing new products
that are based on existing products.

3. Reengineering may make products easier to modify for
purposes of accommodating evolving customer needs.

4. Reengineering may be a catalyst for automating product
maintenance.

5. Reengineering may be a catalyst for application of new
technologies, such as CASE tools and artificial intelligence,
to system acquisition.

6. Reengineering is big business, especially considering the
major investment in legacy systems that need to be updated
and maintained.

In short, reengineering provides a mechanism that enables us to
understand systems better, such that we are capable of extending
this knowledge to new and better systems. Thus, it enhances both
understanding and improvement abilities. Reengineering is
accompanied with a variety of risks that are associated with
processes, people, tools, strategies, and the application area for
reengineering. These risks can be managed using the
methodologies and the metrics discussed in [2] and [14].

A number of authors have suggested specific lifecycles that will
lead to determination of a decision to, or not to, reengineer a
product and, in support of a positive decision, enable a product
reengineering lifecycle [40] [41]. There are a number of needed
accomplishments. These include the following.

1. Initially, there exists a need for formulation, assessment, and
implementation of definitional issues associated with the
technical and organizational environment. These issues
include organizational needs relative to the area under
consideration, and the extent to which technology and the
product or system under reengineering consideration
supports these organizational needs.

2. Identification and evaluation of options for continued
development and maintenance of the product(s) under
consideration is a need, including an option for outsourcing
this activity.

3. Formulation and evaluation of options for composition of the
reengineering team, including insourcing and outsourcing
possibilities, is a need.

4. Identification and selection of a program of systematic
measurements that will enable demonstration of cost
efficiency of the identified reengineering options and
selection of a chosen set of options is required.

5. The existing legacy systems in the organization need to be
examined in order to determine the extent to which these
existing systems are functionally useless at present and in
need of total replacement, functionally useful but with
functional and nonfunctional defects that could potentially be
remedied using product reengineering to create renovated
systems, or systems that are fully appropriate for the current
and intended future uses.

6. A suite of tools and methods to enable reengineering needs to
be established. Method and tool analysis and integration is
needed in order to provide for multiple perspective views across
the various abstraction levels (procedural, pseudoprocedural,
and nonprocedural) that will be encountered in reengineering.

7. A reengineering process for product reengineering needs to
be created on the basis of the results of these earlier steps that
will provide for the reengineering of complete products, or
reengineering of systems, and for incremental reengineering
efforts that are phased in over time.

8. There must be major provisions for education and training
such that it becomes possible to implement whatever
reengineering process eventuates through education and
training.

This is more of a checklist of needed accomplishments for a
reengineering process than it is a specification of a lifecycle for the
process itself. Through pursual of this checklist, we should be able
to establish an appropriate process for product reengineering in the
form of Figures 4 or 11.
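
The three-way examination of legacy systems in item 5 of the
checklist can be sketched as a simple classification. This is an
illustration only: the LegacySystem fields, the Disposition names,
and the example systems are assumptions made here, and a real
assessment would rest on the systematic measurements called for in
item 4 rather than on two yes/no flags.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    REPLACE = "total replacement"
    REENGINEER = "product reengineering"
    RETAIN = "retain as is"


@dataclass
class LegacySystem:
    name: str
    functionally_useful: bool  # does it still serve a current need?
    has_defects: bool          # functional or nonfunctional defects present?


def triage(system: LegacySystem) -> Disposition:
    """Place a legacy system into one of the three categories of item 5."""
    if not system.functionally_useful:
        return Disposition.REPLACE
    if system.has_defects:
        return Disposition.REENGINEER
    return Disposition.RETAIN


if __name__ == "__main__":
    # Hypothetical inventory, invented for illustration.
    inventory = [
        LegacySystem("payroll", functionally_useful=True, has_defects=True),
        LegacySystem("telex gateway", functionally_useful=False, has_defects=True),
        LegacySystem("inventory ledger", functionally_useful=True, has_defects=False),
    ]
    for system in inventory:
        print(f"{system.name}: {triage(system).value}")
```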

There are several needs that must be considered if a product
reengineering process is to yield appropriate and useful results.

1. There is a need to consider organizational and technological
issues to develop useful product reengineering strategy.

2. There is a need to consider human, leadership, and cultural
issues, and how these will be impacted by the development
and deployment of a reengineered product as a part of the
definition of the specifications for the reengineered product.

3. It must be possible to demonstrate that the reengineering
process and product are, or will be, each cost effective and of
high quality, and that they support continued evolution of
future capabilities.

4. Reengineered products must be considered within a larger
framework that also considers the potential need for
reengineering at the levels of systems management and
organizational processes as it will generally be a mistake to
assume that technological fixes only will resolve
organizational difficulties at these levels.

5. Product reengineering for improved post deployment
maintainability must consider maintainability at the level of
process rather than at the level of product only, such as would
result in the case of software through rewriting source code
statements. Use of model based management systems or code
generators should yield much greater productivity, in this
connection, than rewriting code at the level of source code.

6. Product reengineering must consider the need for
reintegration of the reengineered product in with existing
legacy systems that have not been reengineered.

7. Product reengineering should be such that increased
conformance to standards is a result of the reengineering
process.

8. Product reengineering must consider legal issues associated
with reverse engineering.

The importance of most of these issues is relatively self evident.
Issues surrounding legality are in a state of flux in much the same
way as for benchmarking.

It is clearly legal for an organization to reverse engineer a
product that it owns. Also, there exists little debate at this time on
whether inferring purpose from the analysis of existing
functionality of a product and without any attempt to examine the
architectural structure or detailed components that comprise the
existing product, and then recapturing the functionality in terms of
a new development effort (the so called black box approach), is
legal. Doubtlessly, it is legal. Major questions surround the legality
of "white box" reverse engineering in which the detailed
architectural structure and components of a system, including code
for software, are examined in order to reverse engineer and
reengineer it. The major difficulty appears to surround the fair use
provisions in copyright law, and the fact that it is the use of trade
secrets for illicit gain. Copyrighted material cannot be trade secret
since the copyright law requires disclosure of the material
copyrighted. In particular, software is copyrighted and not patented.
So, trade-secret restrictions do not apply. There is a pragmatist
group that says even black box reengineering is legal and a
constructionist group that says it is illegal [42] [43]. These issues
will be the subject of much debate over the near term.

Arnold [44] identified many of these product reengineering
needs in the form of risks that must be managed during the
reengineering effort. Thus, we see that there are risks associated
with a variety of factors for product reengineering, as suggested in
Figure 12.

■ Integration risk is the risk associated with having a
reengineered product that cannot be satisfactorily integrated
with, or interfaced to, existing legacy systems.

■ Maintenance improvement risk is the risk that the
reengineered product will exacerbate, rather than ameliorate,
maintenance difficulties.

■ Systems management risk is the risk that the reengineered
product attempts to impose a technological fix on a situation
where the major difficulties are not need for greater support,
but for organizational reengineering at the level of systems
management.

■ Process risk is that associated with having a reengineered
product that might well represent an improvement in a
situation where the specific organizational process in which
the reengineered product is to be used is defective and in need
of reengineering.

■ Cost risk is associated with having major cost overruns in
order to obtain a deployed reengineered product that meets
specifications.

■ Schedule risk is associated with having schedule delays in
order to obtain a deployed reengineered product that meets
specifications.

■ Human acceptance risk is the risk associated with obtaining
a reengineered product that is not suitable for human
interaction, or one that is unacceptable to the user organization
for other reasons.

■ Application supportability risk is that risk associated with
having a reengineered product that does not really support the
application or purpose it was intended to support.

■ Tool and method availability risk is associated with
proceeding with reengineering a product based upon promises
for a method or tool, needed to complete the effort, which does
not become available or which is faulty.

■ Leadership, strategy, and culture risk is that associated with
imposing a technological fix in the form of a reengineered
product, in an organizational environment that cannot adapt
to the reengineered product.


Figure 12 Some Product Reengineering Risks


Clearly, these risks are not mutually exclusive, the risk attributes
are not independent, and the listing is incomplete. For example, we
could surely include legal risks. We can use this as the basis for a
multi attribute type assessment or for a decision support system
[45] that supports risk management for product reusability.
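
A multi attribute assessment of this kind can be sketched as a
weighted average of per-attribute risk ratings. The sketch below is
hypothetical throughout: the attribute subset, the weights, the
1-to-5 rating scale, and the two options are invented for
illustration, and the additive form treats the attributes as if they
were independent, which the text above notes they are not, so it
should be read as a first approximation only.

```python
# Hypothetical importance weights for a subset of the risk attributes
# listed above; larger means more important to the organization.
RISK_WEIGHTS = {
    "integration": 3,
    "cost": 2,
    "schedule": 2,
    "human acceptance": 3,
}


def weighted_risk(ratings, weights=RISK_WEIGHTS):
    """Return the weighted average of risk ratings (1 = low, 5 = high)."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[name] * ratings[name] for name in weights) / total_weight


# Two invented reengineering options, rated on the assumed 1-to-5 scale.
OPTIONS = {
    "incremental reengineering": {
        "integration": 2, "cost": 3, "schedule": 3, "human acceptance": 2,
    },
    "complete reengineering": {
        "integration": 4, "cost": 2, "schedule": 3, "human acceptance": 1,
    },
}

if __name__ == "__main__":
    # List the options from lowest to highest aggregate risk.
    ranked = sorted(OPTIONS.items(), key=lambda item: weighted_risk(item[1]))
    for name, ratings in ranked:
        print(f"{name}: {weighted_risk(ratings):.2f}")
```

Because the attributes interact, an aggregate score of this sort is
better used to prompt discussion of individual high ratings than to
select an option mechanically.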

G. Other Approaches and Considerations

There are a number of related approaches. Many, as noted, are
discussed in [2]. Two of these are very worthy of further comment
here: sourcing and integration.

The principles supporting strategic sourcing decisions are
conceptually simple and nicely stated by Venkatesan [46].

1. The organization should focus on those components and
subsystems that are crucial to the product itself and where the
organization has critical core capabilities [47] [48] that
support the efforts required and which the organization
desires to sustain. This enables an organization to exercise
judgment concerning subsystems that are strategic and those
that are nonstrategic. It potentially eliminates difficulties that
result from conflicting priorities and sourcing decisions.

2. Components and subsystems should be outsourced where
there exist potential suppliers with a distinct competitive
advantage at producing these. These competitive advantages
could be either those of lower cost producers or higher
subsystem differentiation.

3. Outsourcing should always be used in such a manner that it
supports continuing employee commitment and
empowerment. It is necessary to outsource this in such a
manner that there is minimum opportunity for exploitation
and hollowing of the organization, including its people, by
the external supplier.

These principles lead to a process for strategic sourcing. Lacity and
Hirschheim have published two recent works on information
systems outsourcing [49] [50] in which they identify three generic
types of outsourcing.

■ Body shop outsourcing is a way to meet short term demands
that cannot be met by people internal to the organization even
though the decision would otherwise be favorable to
insourcing.

■ Project management outsourcing involves the use of external
suppliers to furnish a subsystem or service activity, such as
training. This would seem closely equivalent to product line
or subsystem outsourcing.

■ Total outsourcing exists whenever an external supplier is
responsible for all, or a very major portion of a complete
turn-key like information system function.

The bottom line summary message of these authors is that one
really cannot outsource the management of information systems.
There seems to be much agreement with this, especially as
concerns information support for the highest level managerial
decisions [51] [52]. Particularly when there is considerable
outsourcing to external suppliers, there will be a major mandate for
very careful integration of all aspects of the supply chain and an
understanding of the relationships between the supply chain and
the value chain.

There are also a number of interesting discussions that relate to
integration at the level of methods and tools [53] [54] [55]. While
of interest primarily for product reengineering, there is much
relevance to process and systems management reengineering as
well.

IV. Summary

In this paper, we have considered a number of issues relative to
systems reengineering. We indicated that reengineering can take
place at either, or all, of the levels of

■ product,

■ process, or

■ systems management.

Reengineering at any of these levels is related to reengineering at
the other two levels. Reengineering can be viewed from the
perspective of the organization fielding a product as well as from
the perspective of the customer, individual or organizational,
receiving the product. From the perspective of either of these, it
may well turn out to be the case that reengineering at the level of
product only may not be fully meaningful if this is not also
associated with, and generally driven by, reengineering at the levels of
process and systems management. For an organization to
reengineer a product when it is in need of reengineering at the
levels of systems management and/or process is almost a
guarantee of a reengineered product that will not be fully
trustworthy and cost efficient. An organization that contracts for
product reengineering when it is in need of reengineering at the
levels of systems management and/or process is asking for a
technological fix and a symptomatic cure for difficulties that are
institutionally and value related. Such solutions are not really
solutions at all.

Figure 13 is a hypothetical representation of potential need for
reengineering at the levels of product, process, and systems
management. While it may well be the case, as suggested in the
figure, that product reengineering occupies the largest share of
resources, the combined total of resources needed for systems
management and process reengineering may be not insubstantial.
What the figure does not show is the fact that resources expended
upon product reengineering only, and with no investigation of
needs at the systems management and process levels, may well not
be wise expenditures, from either the perspective of the
organization producing the product or the one consuming it.

In an insightful study [56], it is indicated that organizations
often squander resources on efforts that look very promising but which
fail to produce long-lasting results of value for the organization. Four
major ways to fail are identified.

1. Assigning average performers to the reengineering effort,
often because the more valuable people are needed for other
more important efforts, will guarantee mediocre performance
of the product of the reengineering effort.

2. Measuring the reengineering plan and activities only, and not
the results, will often produce deceptive measurement results.

3. Allowing new and innovative ideas for reengineering to be
squelched through opportunistic politics and extreme risk
aversion will preserve the status quo rather than encourage
implementation of beneficial activities in the form of results.

4. Failing to communicate wisely and widely during
implementation will almost always frustrate success.


Fig. 13. Resource Distribution to Implement Systems
Reengineering


To this list, we might add failure to obtain real commitment from
the highest levels of the organization for the reengineering effort. It
might be argued, of course, that this leads to such things as
assignment of average and mediocre performers to the
reengineering effort. These authors also offer five factors said to
enhance success at reengineering:

1. Set aggressive reengineering performance targets in terms of
results.

2. Commit a significant portion of the CEO’s time to the
reengineering effort, especially during deployment.

3. Assign a very senior executive to head the reengineering
effort, especially during deployment.

4. Perform a comprehensive review and analysis of customer
needs, organizational realities, strategic economic issues, and
market trends as a prelude to reengineering.

5. Conduct a pilot study and prototype the reengineering effort
in order to obtain results useful both to refine the
reengineering process and to enhance communications and
build enthusiasm.

While the study was based primarily on organizational
reengineering efforts, there are clear implications in these
suggestions for all three types of reengineering efforts:

■ systems management,

■ process, and

■ product.

And, as we have indicated, there is every reason why due
consideration needs to be given to all three of these efforts in an
integrated fashion for the betterment of each effort.

ABSTRACT

Given the increasing maturity of computer-based
systems, more and more systems are evolving from past
systems rather than being developed from scratch. However,
large and complex systems have put dramatic burdens on
the systems engineers in designing these applications. The
situation has become even more complicated by the
increased introduction of commercial off the shelf (COTS)
products into these systems. Additionally, many
reengineering efforts have only focused on software
reengineering, providing little insight into the other system
functionality not embodied in the software. Today
reengineering technology must be integrated into a system-level
forward engineering framework to effectively meet the
challenges of the future systems. Four critical issues that
reengineering technology must address in order to effectively
facilitate the system-level design of evolution systems
include: (1) the ability to provide means to recapture not
only functional and control (behavioral) descriptions of the
system, but all necessary information at the system level;
(2) the ability to merge reengineered information with
forward engineering information at the system level; (3) the
ability to separate “required” functionality from legacy
functionality which was the result of design and architecture
decisions; and (4) the ability to maintain and reengineer
system level designs. Without these capabilities, future
systems engineers will face two critical shortcomings: (1) an
exceedingly restricted design space during system-level
optimization which will produce ineffective systems, and (2)
fragile implementations which are difficult to modify in
response to either requirements changes or technology
advancements. This paper describes a framework to support
both reengineering and forward engineering information and
provides a means to manage the information. The
framework provides for separation of concerns, systematic
capture of non-functional attributes, and integration of
system information. This framework can also be used to
assess the ability of current reengineering technology to meet
the needs of system level design.

INTRODUCTION 

For computer-based systems, optimizing a
reengineered static or semi-static software architecture to
fit a static or semi-static distributed computer hardware
implementation late in the application’s implementation
provides marginal performance enhancements. Vastly
increased performance can be gained through proper
system engineering. The computer-based system
engineering of large and distributed applications involves
identifying the proper definition and structure of the
system functionality and allocating these functions to
proper resource architectures. Engineers must be able to
effectively trade off between different types of resources
(i.e., hardware, software, and humanware) as well as
alternative resource architectures, to meet requirements
for real-time, cost, safety, etc.
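
As a minimal illustration of allocating functions to a resource architecture under competing requirements, the following Python sketch exhaustively searches every mapping of functions to processors and keeps the cheapest mapping that meets a deadline. The function names, processor names, speeds, prices, and the sequential-execution timing model are all hypothetical assumptions made for illustration. They are not taken from any system discussed here, and exhaustive search is practical only for very small problems.

```python
from itertools import product


def allocate(functions, resources, deadline_ms):
    """Return the cheapest function-to-resource mapping that meets the deadline.

    functions:  {function name: load in ms on a unit-speed resource}
    resources:  {resource name: (relative speed, price)}
    Assumption: each resource runs its assigned functions one after another.
    Returns None when no mapping is feasible.
    """
    names = list(functions)
    best = None
    for choice in product(resources, repeat=len(names)):
        busy = {r: 0.0 for r in resources}
        for fn, r in zip(names, choice):
            speed, _price = resources[r]
            busy[r] += functions[fn] / speed
        worst = max(busy.values())
        if worst > deadline_ms:
            continue  # violates the real-time requirement
        price = sum(resources[r][1] for r in set(choice))
        key = (price, worst)  # cheapest first, then fastest
        if best is None or key < best[0]:
            best = (key, dict(zip(names, choice)))
    return None if best is None else best[1]


# Hypothetical example data.
functions = {"detect": 40, "track": 30, "display": 10}
resources = {"cpu_a": (1.0, 5), "cpu_b": (2.0, 12)}

relaxed = allocate(functions, resources, deadline_ms=50)
tight = allocate(functions, resources, deadline_ms=30)
impossible = allocate(functions, resources, deadline_ms=10)
```

Tightening the deadline changes the preferred architecture: with the relaxed deadline a single fast processor suffices, while the tighter one forces both processors into use at higher cost, which is the kind of trade-off the paragraph above describes.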

A key issue in designing a real-time, large-sized,
complex system is to optimize and assess the design,
based on multiple competing requirements, early and
throughout the design process. However, “optimal”
solutions cannot be generated by addressing various
aspects of the system individually. Non-functional issues,
including financial costs (development, production,
maintenance, logistics), physical constraints, timing,
security, dependability (reliability, safety), maintainability,
etc., will drive the system design as much as functionality
for most large, complex systems.

Many times, effective trade-offs cannot be
performed due to the lack of design descriptions.
Complete information about an application cannot be
specified in a single design model even for moderately
sized systems. At different stages of the design process,
the designer must decide which information is important
to capture and at what level of detail. Existing
reengineering technologies focus only on certain types
of design information (e.g., the design of software
algorithms can be easily recaptured). Reengineering at
other levels in the design process is usually a
labor-intensive and manual process. An existing design reflects a
specific implementation (i.e., a particular hardware
platform and/or a unique software language) that was
selected to perform the desired functions. In capturing
an existing system design, system engineers do not have
the capability to distinguish which part of the design is
driven by the original requirements and which part is
dependent on a particular design decision. Also,
reverse-engineering from code tends to lack non-functional
information.

For  most  computer-based  evolutionary  systems  the 
amount  of  the  former  system  reengineered  or  reused  may 
vary  tremendously.  For  instance,  the  reengineering  may 
facilitate  a  system  upgrade  where  only  a  small  percent  (on 
the order of 5-30%) of the operational code is modified.
In this case the system architecture and infrastructure are
not likely to change significantly. Therefore, the
reengineering  efforts  must  integrate  into  the  existing 
system/software  design. 

On the other hand, if the evolution involves the
next generation of a system, some of the original system
level design might be reused; however, the system
architecture and infrastructure would likely differ vastly
from the original system. This appears to be the case for
the Next Attack Submarine (the follow-on to the
AN/BSY-2 Seawolf) and Combat System 2003 (the
follow-on to Aegis).

For the development of large and complex
systems, requirements derived from user needs are
defined. In turn, these requirements are captured and an
initial design is produced [Hoa91]. Analysis is executed to
assure that the initial design is complete and consistent
[B1F90]. This design is optimized iteratively until a
feasible or an optimal design is achieved [HNH91],
[HNH92]. Collected results are then passed through for
rapid prototyping, assessment, evaluation, test and
refinement to yield the final design [BoB85], [CYH91],
[JeY91], [Kani91], [SvL76]. The design components are
then implemented, integrated and tested. This description
follows a waterfall approach. Other models used, such as
modified waterfall, rapid prototyping or spiral model,
follow the same steps at different levels of detail and
in different orders. In actual system developments, a hybrid
of these approaches is typically used.

This  paper  describes  a  framework  to  support  both 
reengineering  and  forward  engineering  information  and 
provides  a  means  to  manage  the  information.  The 
framework  provides  for  separation  of  concerns,  systematic 
capture of non-functional attributes and integration of
system information. This framework can also be used to
assess the ability of current reengineering technology to
meet the needs of system level design. First, a capture
framework is described. Next, additional annotation
capabilities are described which provide a basis for
engineering decisions. Optimization of system designs
using the annotation and capture methods in an
evolutionary environment is also addressed.

SPECIFICATION AND INTEGRATION ISSUES

Regardless of the process used to develop the
system, a lack of a cohesive and coherent engineering
discipline starts with a shortage of formal and cohesive
techniques for specifying the system under development.
This is especially true for system level design capture and
design analysis. Most large systems are reactive systems
that must respond to external stimuli; therefore, the
systems engineer must be able to predict the system’s
behavior in all scenarios and environments under which
the system will operate, including planned modes of
operation and action, and reaction to internal
system failure (fault tolerance) or externally inflicted
damage (damage tolerance). In order to effectively design
these systems, definitions of the multiple design domains
or views which address the principal system design
perspectives are needed. We propose five Design Capture
Views (DCV) [Hoa91], which are as follows: (1)
Environmental, (2) Informational, (3) Functional, (4)
Behavioral, and (5) Implementation. The capture
approaches for the design domains or capture views share a
common hierarchical structure which supports management
of the magnitude and complexity associated with a large
system design. Flat representations of complex system
designs rapidly become unwieldy as the design detail
unfolds. A hierarchical structure allows the system capture
views to be represented at various levels of detail from a
broad top level, which encompasses the breadth of the
system and its external interfaces, to very low levels, which
describe the details of a particular segment of the system
design.
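
The hierarchical structure shared by the capture views can be sketched as a simple tree, shown here in Python. This is only an illustrative data structure under stated assumptions: the class names `CaptureView` and `ViewNode` and the example node names are hypothetical and do not come from the cited capture methodology.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class CaptureView(Enum):
    """The five Design Capture Views named in the text."""
    ENVIRONMENTAL = 1
    INFORMATIONAL = 2
    FUNCTIONAL = 3
    BEHAVIORAL = 4
    IMPLEMENTATION = 5


@dataclass
class ViewNode:
    """One level of detail within a single capture view (hypothetical)."""
    name: str
    view: CaptureView
    children: list[ViewNode] = field(default_factory=list)

    def add(self, child: ViewNode) -> ViewNode:
        # A refinement stays inside the view of its parent.
        if child.view is not self.view:
            raise ValueError("child must belong to the same capture view")
        self.children.append(child)
        return child

    def depth(self) -> int:
        """Number of levels of detail, counting this node."""
        return 1 + max((c.depth() for c in self.children), default=0)

    def outline(self, indent: int = 0) -> list[str]:
        """Indented listing from the broad top level down to the details."""
        lines = [" " * indent + self.name]
        for child in self.children:
            lines.extend(child.outline(indent + 2))
        return lines


# Hypothetical functional decomposition.
top = ViewNode("Combat system", CaptureView.FUNCTIONAL)
tracking = top.add(ViewNode("Track management", CaptureView.FUNCTIONAL))
tracking.add(ViewNode("Correlate sensor reports", CaptureView.FUNCTIONAL))
top.add(ViewNode("Engagement control", CaptureView.FUNCTIONAL))
```

Calling `top.outline()` yields the view at every level of detail, while inspecting only `top.children` gives the broad top-level picture, which is the property that keeps a large design manageable.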

The five system capture views partition the system
design into logical segments which correspond to key
perspectives of the system design. These five capture
views are distinct but related representations of key
aspects of the system. They are summarized in Table 1.
The design methods portion of the table describes current
generalized methods or techniques used to specify the
information. Specific methods exist for capturing the
Informational, Functional and Behavioral capture views
[DeM79], [HaP87], [WaM85], [ShM88], [Har86].
Implementation and Environmental capture views lack
mature methods for computer intensive real-time systems.
The following paragraphs describe each capture view
objective in more detail.

The Informational Capture View captures a
conceptual representation of the system under design in
abstract  terms.  This  view  captures  all  components  that 
make  up  the  system  and  the  interaction  between  these 
components.  This  allows  for  a  description  of  the  intended 
system  concept  of  operations  (use  analysis)  without 
implying  or  constraining  the  physical  implementation  of 
the  system  under  design. 

The  Functional  Capture  View  establishes  the 
functional  structure  of  the  system.  It  specifies  how  the 
functions  are  decomposed  and  how  the  information  is 
transformed  through  these  functions.  This  provides  a 
better  understanding  of  the  system’s  functional 
decomposition. 

The Behavioral Capture View describes the
system’s attributes over time and describes the event-driven or
time-driven aspects of the system. This view allows for
specification of the system’s behavior at different times
and under various conditions and situations. This provides
a mechanism for specifying real-time and time-critical
aspects of the system. This also may include behavioral
descriptions of dependability, safety, and security.
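
A behavioral description of this kind is commonly written as states plus trigger events. The Python sketch below shows one minimal way to encode and exercise such a description as a state transition table. The states, events, and the rule that unmatched events leave the state unchanged are hypothetical choices for illustration, not part of the capture method itself.

```python
# (current state, trigger event) -> next state; all names are hypothetical.
TRANSITIONS = {
    ("standby", "power_on"): "search",
    ("search", "contact_detected"): "track",
    ("track", "contact_lost"): "search",
    ("search", "fault"): "degraded",
    ("track", "fault"): "degraded",
}


def run(initial, events):
    """Replay a sequence of trigger events and return the visited states.

    Assumption: an event with no matching transition is ignored.
    """
    state = initial
    trace = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        trace.append(state)
    return trace


trace = run("standby", ["power_on", "contact_detected", "fault"])
```

Replaying different event sequences against the same table is a simple way to examine behavior under various conditions, including reaction to an internal fault.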

The  Implementation  Capture  View  defines  the 
physical  hardware,  software  and  human  resources  which 
comprise  the  system  and  its  connectivity  to  external 
systems  [HoK92].  This  is  where  alternative  physical 
system  architectures  are  defined. 

The Environmental Capture View captures the
system under design from an external viewpoint [Kai92].
This view describes the situation(s), environment(s),
expected events and other factors that make up the
conditions under which the system is operating. This
includes the following: initial state of the system under
design; environmental conditions including acoustic,
electromagnetic, and meteorological conditions; threat
types and locations (in a military application); operational
constraints; likely strategic and tactical considerations and
other pertinent items; and concept of operations. The
Measures of Effectiveness (MOEs) which characterize system
performance and establish the “success criteria” for the
system are also specified together with the conditions under
which the MOEs are measured. The Environmental Capture View
also specifies the guidance and constraints of the
environment which the design itself must face.

TABLE 1. SYSTEM CAPTURE VIEWS

Informational Capture View
  View objectives:
    ■ Characterizes system concept of operations
    ■ Represents system components in abstract terms
  Design methods:
    ■ Entity-relationship diagrams
    ■ Attribute/method descriptions

Functional Capture View
  View objectives:
    ■ Defines system functions and decompositions
    ■ Specifies data flow requirements
  Design methods:
    ■ Function/data flow diagrams
    ■ Process specifications
    ■ Data dictionary

Behavioral Capture View
  View objectives:
    ■ Defines system states and trigger events
    ■ Specifies system behavior characteristics
  Design methods:
    ■ Control flow diagrams
    ■ State transition diagrams
    ■ Control specifications

Implementation Capture View
  View objectives:
    ■ Defines the physical hardware, software, and human
      resources which make up the system
    ■ Specifies system physical interconnectivity
  Design methods:
    ■ Hardware, software, and human resource descriptions
    ■ Performance parameters and resource characteristics
    ■ Function-resource mapping

Environmental Capture View
  View objectives:
    ■ Establishes conditions and events constraining system
      operations
    ■ Specifies performance Measures of Effectiveness (MOEs)
      and conditions of measurement
  Design methods:
    ■ Environmental conditions and event descriptions
    ■ External system descriptions
    ■ System initial conditions
    ■ MOEs

An  attempt  to  address  all  of  the  issues  associated 
with  these  capture  views  simultaneously  or  without  a 
structured  methodology  is  a  multi-dimensional  problem  of 
a  magnitude  which  exceeds  the  capacity  of  most,  if  not  all, 
systems  engineers.  Each  of  these  capture  views  provides 
key  information  concerning  particular  aspects  of  the 
system  under  design.  Taken  individually  the  capture  views 
allow  the  systems  engineer  to  partition  the  design  of  a 
proposed or existing system into manageable parts.

As systems increase in complexity and size, the
design capture methods have to scale so that they are
able to capture different aspects of the system in a
complete and systematic manner. Deciding what aspect of
the system needs to be captured and to what level of detail
is a very difficult task. The group of descriptions used to
capture one type of system may be irrelevant or
incomplete for another system. Therefore, the ideal
situation is to define the capturing method for all aspects
of the system and, depending on the system requirements
or on the design phases, emphasize or deemphasize certain
system capture views.


SYSTEM  DESIGN  ANNOTATION 

One key to designing a real-time, large, complex
system is to optimize the design to meet the requirements
and desired MOEs. In order to achieve this, the system
engineer/analyst must have the capability to annotate the
system design with design goals/criteria that relate to the
requirements and MOEs. Whether the system design
emphasizes real-time, reliability, safety, security, cost,
physical constraints, or any specific criteria, a set of design
goals is required to describe the desired characteristics of
the system. This set of design goals provides a framework
for the specification of critical information from which
the system’s qualities and performance can be measured.
The design goals also provide a basis for the trade-off
between design criteria and design alternatives. One
mechanism that allows these design goals to be specified
is the System Design Factor (SDF) [NHL93]. SDFs are
“attributes” associated with any design element in the
design. A Design Element (DE) is a set of one or more
design components such as functions, dataflows, states,
objects, or relationships. Regardless of the system
development process employed, the SDFs can have
major influence in various design activities within the
process (i.e., design capture, design structure, design
allocation, and design trade-off). The SDFs not only
provide a mechanism to capture the system attributes but
also allow them to be related.

SDFs can also be viewed as a communication path
between customers and systems engineers. In general,
system engineers must be able to express and prioritize the
customers’ criteria. These criteria are, in turn, annotated
through the design factors and are used as a guideline for
the design team. By considering these factors early and
throughout the development process, the design team can
avoid both bad designs and designs that do not meet the
requirements, reduce costs, and optimize productivity
[HHN90a], [HHN90b].

Each SDF has one or more metrics defined. In
addition, when associated with a design element, an SDF
may have more than one measurement. A metric
describes a manner in which the factor might be measured.
Metrics can be derived from (1) past experience, (2)
results of simulation/analysis/prototyping, (3) actual
system measurements, (4) other SDF measurements, or
(5) other SDF metrics. By using other SDF metrics or
measurements to define an SDF metric, SDFs are defined
in a hierarchical manner. For example, a higher level
Real-Time Performance (RTP) SDF can be derived from
two other SDFs: Deadline Success Rate (DSR) and
Deadline Criticality (DC). The metric for RTP might be
a mathematical formula relating DSR and DC, e.g., the
product. Metrics may also be related to the hierarchical
specification. For example, in a functional decomposition,
the Total Real-Time Performance of some Design
Element, TRTP(de), may be the product of the RTP values
of the decomposition components of de, as shown in EQ 1.

EQ 1: $\mathrm{TRTP}(de) = \prod_{d \in \mathrm{DDE}} \mathrm{RTP}(d)$

such that DDE is the set of Design Elements
which are a decomposition of Design Element de.
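
The hierarchical derivation of metrics can be made concrete with a short Python sketch. It takes the example in the text literally: RTP is the product of DSR and DC, and TRTP is the product of RTP over the decomposition. Three things are assumptions added here for illustration: both factors are treated as values between 0 and 1, a design element with no decomposition uses its own RTP, and nested decompositions are handled by recursion. The element names and numbers are hypothetical.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class DesignElement:
    name: str
    dsr: float = 1.0  # Deadline Success Rate, assumed to lie in [0, 1]
    dc: float = 1.0   # Deadline Criticality, assumed to lie in [0, 1]
    parts: list = field(default_factory=list)  # the decomposition set DDE


def rtp(de):
    """Real-Time Performance as the product of DSR and DC."""
    return de.dsr * de.dc


def trtp(de):
    """Total Real-Time Performance: product of RTP over the decomposition.

    Assumption: a leaf element contributes its own RTP, and a part that is
    itself decomposed contributes its own TRTP.
    """
    if not de.parts:
        return rtp(de)
    return prod(trtp(part) for part in de.parts)


# Hypothetical decomposition with two leaf elements.
tracking = DesignElement(
    "track management",
    parts=[
        DesignElement("correlate", dsr=0.9, dc=1.0),
        DesignElement("filter", dsr=0.8, dc=0.5),
    ],
)
total = trtp(tracking)  # about 0.36 (0.9 * 0.4)
```

Because the aggregate is a product, a single weak component pulls the total down sharply, so the choice of aggregation formula is itself a design decision that should be recorded with the metric.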

It has been previously shown that, based on the
design goals and design parameters, the engineer can tailor
a single-criterion or multi-criteria objective function for
optimization [NaF91].
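
One simple way to picture a tailored objective function is a weighted sum over SDF measurements, sketched below in Python. The weighted-sum form is an assumption chosen for illustration and is not claimed to be the formulation of [NaF91]. The SDF names, the weights, and the candidate designs are hypothetical, and all measurements are assumed to be normalized to the range 0 to 1.

```python
def weighted_objective(weights):
    """Build a scalar objective from per-SDF weights.

    A negative weight penalizes a factor (for example, cost).
    """
    def objective(measurements):
        return sum(w * measurements[name] for name, w in weights.items())
    return objective


# Single-criterion and multi-criteria objectives (hypothetical weights).
single = weighted_objective({"RTP": 1.0})
multi = weighted_objective({"RTP": 0.3, "reliability": 0.5, "cost": -0.2})

# Hypothetical normalized SDF measurements for two candidate designs.
candidates = {
    "design_a": {"RTP": 0.9, "reliability": 0.8, "cost": 0.7},
    "design_b": {"RTP": 0.7, "reliability": 0.95, "cost": 0.2},
}

best_single = max(candidates, key=lambda name: single(candidates[name]))
best_multi = max(candidates, key=lambda name: multi(candidates[name]))
```

With these numbers the single-criterion objective prefers `design_a` and the multi-criteria objective prefers `design_b`, which shows how the tailoring of the objective changes the outcome of the optimization.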

SYSTEM LEVEL OPTIMIZATION ISSUES

A  template  of  SDF  is  presented  in  Figure  1. 
This  is  a  super  class  definition  of  SDF.  The  purpose  of 
this  template  is  to  provide  a  general  format  to  guide  the 
systems  engineer  or  the  customer  in  the  application  of  the 
SDF.  It  assists  the  engineer/customer  in  specifying  the 
goals/criteria  to  be  measured.  As  a  system  design 
matures, SDF class types are defined, a class hierarchy of
SDFs is generated, and instantiated SDF “objects” are used
to annotate particular design elements.

Currently, there are twelve items in the template.

(1) Name is an unambiguous name of the SDF.

(2) Type is a classification of the SDF.

(3) Range is either the minimum and maximum values or the
cardinality of the SDF.

(4) Units is the unit of measurement of the SDF.

(5) Methods/Principle lists approaches or techniques that the
designer/customer considers to affect changes in the
factor’s value.

(6) Rationale lists the reasons that this factor applies to
design elements and specifies which design element types
are appropriate.

(7) Relationship lists other SDFs that are closely associated.
The Relational Expression field in this item lists types of
associations for each Relationship.

(8) Quantification is divided into Type and Metrics fields.
The Type field describes value types of quantifications such
as integer, float, double, short, or long. The Metrics field
describes how measures are determined. Once instantiated,
the Metrics field also holds values measured. This aspect of
the SDF is described in more detail in an earlier section.

(9) Consistency Rules lists rules to determine consistency of
the SDF in various design specifications. Rule types include
By-Aggregation, By-Type, By-Design Factors, By-View, and
By-Component rules. For example, the By-Aggregation field
provides a slot that holds the rule governing this factor’s
consistency throughout the hierarchy (e.g., Use Rule X
and Rule Y).

(10) Reference is the source or reference of the factor, which
may be a publication or simply the name of the designer that
formulated this SDF.

(11) Definition provides a textbook style definition of the
SDF.

(12) Annotation provides areas for free form comments
relevant to the SDF. The Annotation may include further
SDF background information or provide warnings related
to the SDF.
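
The template items map naturally onto the fields of a record type. The Python sketch below is one hypothetical rendering under stated assumptions, not the template of Figure 1 itself: the field names, the choice of Python types, the range check, and the example values are all illustrative. The Quantification item is split into a value type, a metric description, and a list of recorded measurements.

```python
from dataclasses import dataclass, field


@dataclass
class SystemDesignFactor:
    name: str                                  # (1) Name
    sdf_type: str                              # (2) Type
    value_range: tuple                         # (3) Range as (minimum, maximum)
    units: str                                 # (4) Units
    methods: list = field(default_factory=list)            # (5) Methods/Principle
    rationale: str = ""                        # (6) Rationale
    relationships: dict = field(default_factory=dict)      # (7) SDF -> expression
    quantification_type: str = "float"         # (8) Quantification: Type
    metric: str = ""                           # (8) Quantification: Metrics
    measurements: list = field(default_factory=list)       # (8) values measured
    consistency_rules: dict = field(default_factory=dict)  # (9) rule type -> rule
    reference: str = ""                        # (10) Reference
    definition: str = ""                       # (11) Definition
    annotation: str = ""                       # (12) Annotation

    def record(self, value):
        """Store a measurement after checking it against the Range item."""
        low, high = self.value_range
        if not low <= value <= high:
            raise ValueError(f"{value} is outside the range of {self.name}")
        self.measurements.append(value)


# Hypothetical instance annotating some design element.
dsr = SystemDesignFactor(
    name="Deadline Success Rate",
    sdf_type="real-time",
    value_range=(0.0, 1.0),
    units="fraction of deadlines met",
    metric="deadlines met divided by deadlines scheduled",
    consistency_rules={"By-Aggregation": "Use Rule X and Rule Y"},
)
dsr.record(0.95)
```

Keeping the range and the consistency rules alongside the measured values lets a tool reject inconsistent annotations at the moment they are entered rather than during later analysis.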


THE FRAMEWORK AND REENGINEERING

The  framework  proposed  addresses  many  of  the 
critical  issues  described  in  the  abstract.  It  provides  the 
ability  to  capture  and  maintain  more  complete  system 
information. Since the key issue in reengineering is the
ability to understand the functionality, the behavior, and
the  implementation  of  the  existing  system,  the  design 
capture  views  framework  naturally  supports  this  process. 
Depending  on  the  level  of  reengineering,  information 
about  the  existing  system  can  be  specified  in  the 
appropriate  capture  views.  Supporting  various  levels  of 
detailed  descriptions  as  well  as  multi-levels  of  design 
abstraction,  the  capture  views  provide  a  framework  for  the 
whole  spectrum  of  reengineering.  For  example,  for  the 
reuse  of  existing  software,  the  functionality  of  the  existing 
code  can  be  captured  in  the  software  architecture  of  the 
implementation view. This software architecture provides
a better understanding of the software functions of the
existing code, which can then be assessed for its use in new
required  applications.  For  the  reengineering  at  system  or 
subsystem  level,  the  functionality,  the  behavior,  and  the 
implementation  of  the  existing  subsystem  (or  system)  can 
be  captured  by  the  functional,  behavioral  and 
implementation  capture  views. 

With  the  support  of  the  SDFs  in  describing 
various DEs of the capture views, the design of an existing
system can be evaluated for reuse and integrated into new
and  evolutionary  designs.  Existing  designs  can  be 
evaluated  independently  or  as  part  of  a  new  design.  The 
capability  to  integrate  old  and  new  designs  and  analyze 
them  based  on  different  criteria  allows  the  employment  of 
reengineering  to  be  properly  evaluated  and  justified. 

With these framework advantages, the design
space considered by system designers could be broadened.
In addition, systems will more easily evolve from the
design, with modification alternatives quickly determined
and  assessed.  This  should  be  true  whether  the  changes 
occur  due  to  a  top-level  requirements  change  or  an 
implementation-level  technology  insertion. 

SUMMARY  OF  CURRENT  STATUS 

Definitions  and  examples  of  usage  of  the  multi- 
domain  system  design  capture  methodology  have  been 
documented  in  several  reports  [HKH91],  [Hoa91], 
[HoK92]. Much work still needs to be done in the
development of specific representations within each of the
capture views, especially within the Behavioral,
Implementation and Environmental Capture Views.
Currently,  the  study  of  the  relationship  and  transition 
between  these  capture  views  is  only  focused  on  the  link 
between  the  Functional/Behavioral  and  Implementation 
Views  [Hoa91].  Near-term  goals  also  include  the  study  of 
the  transition  between  design  capture  and  design 
evaluation.  Transformation  techniques  must  be  able  to 
preserve  all  information  from  design  capture  such  that 
analysis  results  will  reflect  a  correct  system. 

Automation  support  for  the  generation  of  the 
capture  views,  transition  between  them,  and  transition 
from  design  capture  to  design  evaluation  is  a  critical  issue 
for the success of the employment of this method. The
evaluation of current CASE tools (e.g., Teamwork,
Software Through Pictures, Statemate, RDD-100) and
simulation tools (e.g., ADAS, SES Workbench, Bones)
shows  that  they  do  not  provide  complete  representation 
and  evaluation  capabilities.  One  immediate  solution  is  the 
integration  of  various  tools.  However,  more  research  is 
needed  to  provide  a  seamless  support  environment  for 
system  design. 

The  relationship  between  SDFs  is  not  well 
understood  at  the  present  time,  but  there  are  attempts  to 
correlate  these  factors  as  this  effort  progresses.  SDFs  for 
security  and  dependability  aspects  of  the  system  are  being 
developed  by  related  efforts  and  the  use  of  SDFs  within  a 
multi-view  specification  of  the  system  is  ongoing.  These 
factors  are  intended  to  be  used  throughout  and  are  critical 
to the entire system engineering process. For instance,
they are used to specify in the requirements phase,
encapsulate in the capturing phase, quantify and evaluate
in the analysis phase, characterize in the optimization
phase, and justify in the design trade-off phase.

A  software  tool  called  DESTINATION  (Design 
Structuring  and  Allocation  Optimization)  has  been 
developed to explore issues described in this paper.
DESTINATION has been linked to commercial
specification tools and is being linked to simulation and
schedulability assessment tools. In the future,
DESTINATION will be linked to reengineering tools.
DESTINATION  provides  SDF  annotation  and  resource 
allocation  optimization.  Design  guidance  capabilities  are 
in  development. 

The  future  plans  include  refining,  restructuring 
and  streamlining  (if  necessary)  the  optimization  of  designs 
using Multi-View Capture and SDFs. A dedicated
research effort is considering a small but widely used set
of design factors. Formalisms are currently lacking, and
single point solutions are being used, for the
Environmental and Implementation Capture Views. In
addition, research into providing robust capture methods
for these areas will be explored. The formulation will be
incorporated into a sonar example [Hoa91] and other applicable
examples.

Together,  the  multi-view  capture  and  optimization 
based  on  SDFs  provide  a  framework  for  the  results  of 
real-time  system  simulation/analysis  to  view  and  manage 
a total system engineering process.

Abstract

The  complexity  and  the  volatility  of  requirements 
for  large  scale  software  systems,  and  the  vast  in-place 
investments,  make  evolution  a  necessity.  With  evolution 
there  is  risk  that  critical  user  objectives  will  not  be  met 
within  schedule  and  funds  budgeted  for  new  capabilities. 
To assure that mission needs are satisfied, the evolutionary
process must be strongly influenced by users, but also
must be carefully controlled by buyers and efficiently
carried out by builders. This paper discusses how
architectural  approaches  add  value  by  providing,  from 
multiple  perspectives,  a  vision  of  objectives  that  is 
understandable  to  users,  buyers,  and  builders  alike.  It  then 
describes how these same approaches also offer a means to
organize  the  evolutionary  engineering  activity  around 
information  needs  for  the  target  system.  The  paper 
illustrates  this  and  shows  that  use  of  an  "information 
architecture"  helps  to  assure  that  results  meet  ultimate 
mission  needs  by  focusing  engineering  activities  on  the 
needs  of  the  end  user. 

1  Introduction 

Continuing evolution of large-scale software-intensive
systems provides the context for this paper.
Even  when  precedents  can  be  used  to  foster  common 
understanding  of  objectives  and  methods,  large-scale 
systems  pose  exponentially  greater  difficulty  in 
communication  than  is  the  case  in  smaller  systems.  We 

1 This work is sponsored in part by the U.S. Department of Defense.
The views and conclusions contained in this document are solely those of
the author and should not be interpreted as representing official policies,
either expressed or implied, of the Software Engineering Institute,
Carnegie Mellon University, the U.S. Air Force, the U.S. Department of
Defense, or the U.S. Government.

2 The concepts presented in this paper are based partly upon past
experience in maintaining large scale software systems for satellite
ground data processing, and based partly upon recent work at the SEI
related to a mission critical evolutionary development program.


introduce  the  notion  of  perspective  as  a  means  to  mediate 
this difficulty. In art, this allows appreciation of an artist's
viewpoint. Perspective in engineering also helps create
common  viewpoints  on  which  to  optimize  communication 
among  users,  buyers,  and  builders. 

Object-oriented  views  offer  users  of  software 
systems  a  means  to  deal  with  complex  software 
abstractions  in  familiar  terms.  In  addition,  object- 
orientation  provides  builders  a  means  to  reuse  artifacts 
efficiently.  Realizing  Cook's  [16]  first  principle  of  object 
technology (i.e., hiding the data behind the processing)
also requires an information structure within which useful
and feasibly produced implementation objects are
identified and elaborated. These differing aspects of
object  orientation  suggest  the  levels  of  architectural 
information  that  are  needed  to  successfully  evolve 
systems. 

Evolutionary  reengineering  is  iterative  and 
incremental.  The  continuing  challenge  is  to  determine 
how to know, and when it is true, that the interim results
are proceeding most rapidly toward the ultimate mission
needs of the user. Insight from consideration of the
object-oriented and architectural bases for software
evolution offers a model that provides the leverage needed
to  answer  these  questions. 

Software  architectures  are  conceptual,  intangible, 
and  abstract.  Yet  if  they  are  to  guide  software  engineers, 
they  must  be  concrete,  visible,  and  as  obvious  as  possible. 
To  achieve  this  goal,  we  apply  the  notion  that  one  learns 
best  while  doing.  From  this  comes  the  conclusion  that 
architecture  is  most  effective  in  communicating  vision 
when  it  is  thoroughly  integrated  into  the  evolutionary 
process  itself.  Architectural  insight  clearly  provides  a 
framework for that process. By inverting the title of Best's
article ("If They Built Buildings the Way They Build
Software") we suggest a way to define a practicable and
pragmatic paradigm for applying architectural concepts
toward reengineering large scale software systems [11].

The evolutionary reengineering process mediates
problems in dealing with large-scale systems but fails (1)
without user-oriented direction and control, (2) without a
process framework that assures that mission needs are met,
and (3) without consistent engineering methods and
evaluation paradigms. Gelernter's insight into a different
way of viewing large scale systems (i.e., that software
"shadow systems" are needed to track the evolution of
actual systems) is the beacon followed in this paper for
identifying  technology  needed  to  enable  evolutionary 
reengineering  [23]. 

An  architectural  framework  for  evolution  should 
offer  (1)  direction  based  on  vision  clearly  understood  by 
both  users  and  builders;  (2)  controls  that  reflect  knowledge 
of  both  legacy  systems  and  problem  domain;  and  (3) 
correlated  process  framework,  engineering  methods,  and 
evaluation  paradigms. 

The  thesis  of  this  paper  is  that  an  information 
architecture  provides  this  framework.  To  develop  this 
position,  the  use  of  architectural  concepts  is  discussed,  and 
hypotheses are offered regarding applying them for
evolution.  Architectural  views,  combined  with  object- 
oriented  concepts,  prevent  the  failure  modes  identified  for 
evolutionary  reengineering. 

Terms  of  reference  are  offered  to  frame  the 
discussion.  Architectural  concepts  that  add  value  to  the 
evolutionary  process  are  introduced  in  the  context  of 
object  oriented  techniques.  A  set  of  five  architectural 
propositions  for  evolving  systems  suggest  how 
architectural  approaches  contribute  to  software 
engineering  activities. 

Discussion  of  engineering  activities  then 
illustrates  how  object-oriented  considerations  and  levels  of 
architectural  information  help  assure  engineering  results 
that  meet  long-term  mission  needs.  The  value  of 
information  architecture  in  guiding  this  activity  is 
summarized in a five-point conclusion.

2 Concepts and Definitions
2.1  Architecture 

The  term  "architecture"  is  used  in  this  paper  in 
the  context  of  abstract  systems.  Architecture  reflects 
many kinds of structure and is a basis for several kinds of
reasoning. Ideas presented are influenced heavily by
tutorials and presentations from M. Shaw [53], and J.
Zachman  [64,65]  and  by  papers  from  D.  Perry  [44]  and  B. 
Gaines  [21].  Saunders  [49,  50]  and  Horowitz  [28]  offer 
views  of  the  nature  of  software  architecture  for  use  in 
support  of  software  acquisition.  Anderson's  ideas  [5]  that 
architectural  concepts  are  the  structuring  paradigms  of 
software  systems,  and  that  envisioning  a  product  family 
involves  creating  an  appropriate  architecture,  are  also 
helpful  in  obtaining  an  operational  understanding  of  what 
the  term  architecture  means  to  software  engineering. 
Jones'  [31]  definition  of  architecture  offers  the  following 
key  points: 

•  a  structure  of  elements  of  known  properties, 

•  which  have  rules  of  interaction  and  relations 

•  and  provide  a  basis  for  reasoning 
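
The three points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names, the `layer` property, and the interaction rule are hypothetical and do not come from Jones' definition. The sketch shows elements with known properties, a rule that constrains which elements may interact, and one simple form of reasoning over the structure (which elements can be reached from a given element).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    name: str
    layer: int  # a known property; the layer numbering is illustrative


@dataclass
class Architecture:
    elements: dict = field(default_factory=dict)
    links: set = field(default_factory=set)

    def add(self, name, layer):
        self.elements[name] = Element(name, layer)

    def connect(self, src, dst):
        # Rule of interaction (assumed for illustration): an element may
        # only call an element in its own layer or the layer directly below.
        a, b = self.elements[src], self.elements[dst]
        if a.layer - b.layer not in (0, 1):
            raise ValueError(f"{src} may not call {dst}")
        self.links.add((src, dst))

    def reachable(self, start):
        # Basis for reasoning: which elements can be affected from `start`?
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for s, d in self.links:
                if s == node and d not in seen:
                    seen.add(d)
                    stack.append(d)
        return seen


arch = Architecture()
arch.add("display", 2)
arch.add("tracker", 1)
arch.add("driver", 0)
arch.connect("display", "tracker")
arch.connect("tracker", "driver")
```

Here a direct link from `display` to `driver` would be rejected by the assumed rule, which is the kind of structural constraint an architecture makes explicit.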

2.2  Evolution 

The  notion  of  "evolution"  is  used  in  a  literal 
sense.  As  a  "series  of  changes",  evolution  must  have  a 
starting  point  (legacy).  Having  "a  certain  direction", 
evolution  differs  from  an  unconstrained  series  of 
modifications that may have little direction over the long
term. It also differs from development, which has both
direction  and  precisely  specified  results.  Webster  defines 
evolution  as 

• a series of related changes in a certain direction

2.3 Object

In  this  paper,  the  literal  definition  of  "object"  is 
broadened  by  the  notion  of  "cognitive  apprehension", 
versus  the  more  literal  "visible",  and  extended  by 
including software engineering notions regarding object
capabilities.  This  produces  the  definition  that  follows: 

•  a  tangible  or  cognitively  apprehensible  thing,  having 

•  discrete  boundaries,  and  having 

•  intrinsic  information  and  state,  and 

•  capabilities  including  communications 
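
As a minimal sketch of this definition, the Python fragment below models one such object. The class and method names (`SensorObject`, `Logger`, `receive`) are hypothetical and chosen only for illustration. The public methods form the discrete boundary, the private attributes hold intrinsic information and state, and `record` shows a capability that includes communication with other objects.

```python
class SensorObject:
    """Illustrative object: boundary, intrinsic state, communication."""

    def __init__(self, name):
        self.name = name        # intrinsic information
        self._readings = []     # intrinsic state, hidden behind the boundary
        self._listeners = []    # other objects this one can message

    def subscribe(self, listener):
        self._listeners.append(listener)

    def record(self, value):
        # Capability: update state, then communicate the change.
        self._readings.append(value)
        for listener in self._listeners:
            listener.receive(self.name, value)

    def latest(self):
        return self._readings[-1] if self._readings else None


class Logger:
    """A second object that only receives messages."""

    def __init__(self):
        self.messages = []

    def receive(self, sender, value):
        self.messages.append((sender, value))


sensor = SensorObject("range")
log = Logger()
sensor.subscribe(log)
sensor.record(42.0)
```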

2.4  System 

The  idea  of  "large  scale  software  system"  is 
central  to  this  paper.  In  the  literal  definition  of  "system", 
the  key  idea  used  is  "aggregation  of  objects".  An  extract 
from  Webster  brings  this  out. 

• a complex unity of diverse parts

•  serving  a  common  purpose 


•  an  aggregation  of  objects 

•  joined  in  regular  interaction  or  interdependence 

3  Leverage  for  Reengineering 

Using  an  object-oriented  approach  to  identify, 
elaborate,  and  incrementally  realize  a  software  architecture 
provides leverage for several aspects of reengineering
large software systems. Both the object-oriented and the
software architectural approaches improve
communications about the system by reducing technical
and mission-oriented goals to familiar aggregates of
computer processing, user functionality, and mission data.
Both approaches support evolution of new capabilities
versus specialized development of integral functions. The
combination  of  the  two  approaches  facilitates  efficient 
integration  of  the  improved  system  capabilities  which 
satisfy  ultimate  user  needs. 

3.1  Object-Orientation 

The  Farringdon  Group  report  on  systems 
productivity  with  an  object-oriented  paradigm  cites 
disintegration  of  manual  user  functions,  inability  to 
modify newly automated functions, and complications in
accessing and sharing data as the intransigent problems of
classical  computer  systems  development.  The  object- 
oriented  approach  to  systems  development  is  neither 
object-oriented  programming,  nor  object-oriented  design, 
nor  object-oriented  analysis.  This  approach  applies  the 
concepts  underlying  these  object-oriented  disciplines  so  as 
to  realize  systems  that  are  designed  for  modification, 
structured  around  real  world  objects,  and  composed  of 
self-organizing components. The underlying principles of
object-orientation,  which  are  encapsulation,  information 
hiding,  classification,  inheritance,  and  polymorphism  are 
applied  in  this  object-oriented  systems  approach  to  build 
readily  evolvable  parcels  of  software  whose  functions  and 
data are localized; which intrinsically reflect real world
objects; which avoid artificial division of enterprise
activities into "automation functions"; which rely on
"intelligent  data  packages";  and  which  emphasize  systems 
development  as  a  series  of  iteratively  refined  modeling 
activities.  This  latter  emphasis  is  inherently  evolutionary 
in  that  development  of  new  applications  is  an  implicit 
result  of  continuing  modeling  of  the  attributes  and 
operations  of  new  and  refined  objects  of  the  mission  or 
enterprise.  In  the  Farringdon  Group's  object-oriented 
systems development paradigm, the associated
development environments and applications architectures
are themselves similarly integrated elements (e.g., class
browsers and debuggers, subsystem prototypes, and
federated platforms and information infrastructures) of the
large scale object oriented system [19].

Figure 1 - A Layered Architectural View of Generic System Capabilities

Capretz  examined  methodological  aspects  of  the 
object-oriented  paradigm.  He  cites  instances  of 
jeopardized traceability of requirements when object-oriented
and structured methods are combined, and
strongly  recommends  use  of  object-oriented  design  and 
analysis  methods  so  as  to  assure  realization  of  object- 
oriented  systems.  However,  he  found  need  for  more 
experimentation  before  a  large  scale  software  system  can 
be  developed  without  risk  with  use  of  an  object-oriented 
approach,  and  he  strongly  advocates  the  pursuit  of 
improved  object-oriented  analysis  and  design 
methodologies  [12]. 

Shelton reports the criticality of using an iterative
development  approach  to  guide  implementation  of  an 
object-oriented  methodology.  He  cites  distinct  needs  for 
enterprise  (mission-oriented)  models,  for  operational 
models  that  have  application  independent  components,  and 
for  implementation  models  (blueprints  for  service  layer 
classes  of  objects),  as  well  as  for  integration  models  that 
are  class  level  physical  designs  for  object  implementation. 
Where  legacy  systems  are  available  for  integration,  these 
latter  integration  models  deviate  from  implementation 
models  and  become  the  means  for  integrating  legacy 
components and features [54].

Analysis  techniques  link  the  object-oriented 
paradigm  to  architecture-based  development  approaches. 
Coad details a procedure for object-oriented analysis. It is
based  upon  recognition  of  three  intuitive  analytical 
approaches:  differentiation  of  experience  into  objects  and 
attributes;  distinguishing  between  whole  objects  and  their 
component  parts;  and  formation  of  and  distinguishing 
among classes of objects [15].

In the manufacturing domain, Coad's object-oriented
modeling paradigms have been applied
successfully  to  develop  process  planning  and  assembly 
control  software  applications.  The  generality  of  these 
object-oriented  applications  has  led  to  practical  definition 
of  subsystem  software  architectures  [4]. 

3.2 Architectural Leverage

The architectural approach for software
engineering of large scale systems is being researched by
the Advanced Research Projects Agency (ARPA) program
in Domain Specific Software Architectures (DSSA).
Different approaches have been examined. One of these is
oriented  toward  exploiting  domain  expertise.  This  is 
represented  in  a  domain  model  that  is  independent  of  any 
implementation.  Another  approach  is  based  upon 
representation  of  a  particular  architectural  style  associated 
with a hierarchy of control devices. A third approach
relies  upon  formal  engineering  models  of  domain 
dependent  computations.  However,  all  of  these 
approaches  provide  a  basis  for  a  software  architecture  that 
supports  both  focus  on  and  resolution  of  design  decisions, 
and  which  becomes  a  framework  for  development  support 
tools.  As  a  result  of  this  research,  common  languages  for 
evolution  of  domain-specific  applications,  and  frameworks 
for  software  reuse  are  becoming  available  [41]. 

Perry  emphasizes  that  the  three  chief  values 
added  by  the  architectural  approach  are  significantly 
greater  support  for  software  reuse,  increased  support  for 
generational  reuse,  and  insight  into  the  nature  of  principles 
for  composition  of  software  systems  [45].  Hayes-Roth  et 
al.  report  that  their  development  of  architectural 
approaches  for  software  systems  development  has 
enhanced  their  ability  to  provide  knowledge-based, 
artificially intelligent tool support for the process of
developing controller applications [26]. Agrawala et al.
report that their research relies extensively on formal
models with which an open toolset and a layered
architecture are used to increase applications' reliability,
real-time  performance,  and  fault-tolerance  [3]. 

In advancing the understandability of evolving
systems, architectural representations are of particular
value. Certain capabilities are common to nearly all
systems. These can be viewed as generic attributes of
typical system functions via a layered structure of related
generic capabilities. Figure 1 presents a set of layered
slices  of  system  capability  as  concentric  rings  of 
increasing  relevance  to  user  operations  from  the  outer  to 
the  innermost.  These  layers  themselves  are  arbitrary;  the 
aim  of  the  diagramming  technique  is  to  facilitate 
understanding  of  the  system  [21].  The  layering  approach 
to  modeling  evolving  systems  is  also  applied  to  formal 
representations  that  are  well  suited  for  development  of 
tools  which  can  automatically  provide  a  capability  for 
program  restructuring  [25]. 

3.3 Enabling Evolution

The  layout  of  the  concentric  rings  in  Figure  1  also 
suggests  relationships  within  the  system  that  will  allow 
isolation  of  locales  in  which  technological  upgrades  can 
proceed  independently  from  changes  in  mission  or 
functionality. These locales are sites for technology
growth. Analyzing them identifies paths for technology
transition  that  are  essential  for  successful  evolution. 

Evolution  of  system  capabilities  is  further 
advanced  by  definition  of  intra-object  schemas. 
Mittermeir  proposes  a  semantic  roadmap  to  an  object's 
subcomponents,  and  a  service  channel  into  an  object. 
Through  these  paths,  high  level  modification  operations  on 
aspects  of  the  structure  of  the  object  can  be  performed 
[42].  Limited  implementations  of  this  kind  of  capability 
currently  exist  in  commercial  software  architectural 
frameworks  which  enable  development  and  operation  of 
object-oriented workstation applications [18].


Using life cycle products of previous
developments is the essence of evolution. Basili and
others have defined a reference architecture for a software
factory which enables the derivation of specific
architectural instances [7]. The potential of this software
factory for supporting evolution of software system
artifacts is presented as a case study of a Toshiba
Corporation development organization which produces
application programs for manufacturing process control
systems. A high degree of productivity is reported to have
resulted from reuse of over 50% of code artifacts. The
lack of models of experience functions, such as result from
the object-oriented modeling paradigms described above,
is seen as the limiting factor in the productivity of this
factory. Another aspect of enabling evolution involves
support for prototyping. Maxim et al. describe an object-
oriented design tool which permits non-programmers to
construct high performance configuration design systems
graphically from a library of reusable mechanisms. Their
experimental system also includes tools to support analysis
of the performance of the resulting code [40].

The utility of an architecture for enabling
evolution of an acceptable systems design is greatly
enhanced if its perspectives completely span the range of
systems of interest to the user. The smallest set of such
perspectives has members that are completely uncorrelated
with one another.

For  software  systems,  basis  architectural 
perspectives  should  address  topology,  behavior,  function, 
and  information.  In  the  list  that  follows,  these 
perspectives  are  represented  by  models  of  varying 
fineness. This fineness reflects composition relationships
among  system  elements,  objects,  data  entities  and 
operations,  and  similar  relations  among  system 
components,  processes,  tasks,  messages  and  calls.  Shaw 
[52]  addresses  the  composition  of  systems  from 
subsystems  in  which  codification  of  architectural  features 
becomes  a  similar  basis  for  developing  formal  system 
specifications.

Figure 2 - A Sample Decomposition of System Components

Organizing, rationalizing, and rectifying systems
models from architectural perspectives helps assure
complete and consistent reflection of all key system
properties for an evolutionary software engineering
process. The following list of diagrams illustrates several
different perspectives from which a software architecture
can be modeled.

• Generic Framework Model (topology of elements)

• Control Flow Model (operations or activities)

• Data Flow Model (data transformations)

• Dynamic Behavioral Model (state transitions)

• Object Model (element attributes; relations)

• Object Interaction Model (messages, calls, callbacks, flags)

• Functional Flow Model (components, processes, tasks)


3.4  Understanding  System  Properties 

The  rules  of  interaction  between  elements  of  an 
architectural  model,  and  the  properties  of  those  elements, 
are  themselves  captured  in  the  above  models  of  static  and 
dynamic  aspects  of  a  system.  Software  systems  usually 
have  at  least  two  kinds  of  properties:  communications 
properties  and  computational  properties.  In  terms  of  these 
two,  a  general  decomposition  of  the  component  structure 
of  a  system  can  be  readily  obtained  with  another  layering 
approach. 

Computational  layers  may  include  a  shell  that 
relates  computations  to  the  user,  a  functional  layer,  and  a 
supporting  or  infrastructural  layer  through  which  theory, 
mechanisms,  and  operating  environment  are  linked  and 
engaged.  Communications  layers  may  address 
interchange  within  system  elements,  among  those 
elements  and  outside  the  system. 
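
As a rough illustration, the two layerings can be crossed to enumerate candidate component categories, each named by one computational layer and one communications layer. The layer labels below paraphrase the text; treating every pair as a meaningful category is an assumption of this sketch, not a claim made by the paper.

```python
from itertools import product

# Labels paraphrase the layers named in the text above.
COMPUTATIONAL = ("user shell", "functional", "infrastructure")
COMMUNICATIONS = ("within elements", "among elements", "outside system")


def component_categories():
    """Return every (computational, communications) pairing."""
    return list(product(COMPUTATIONAL, COMMUNICATIONS))


categories = component_categories()
for comp, comm in categories:
    print(f"{comp:15s} x {comm}")
```

A grid like this gives users and builders a shared checklist: each cell either names a category of component that the system needs or is deliberately left empty.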

Figure 2 illustrates the decomposition of an
arbitrary system into categories of components that are
indicated by pairs of system properties from within the
above layers [60]. These categories of components are
inherently understandable to both builders and users. Lane
[35] identifies three other categories of components
(applications-specific, shared user-interface, and device-
dependent) which are useful in describing structural
alternatives. In his work [34] with architectures for user
interface software, he identifies a design space of 25
functional and 19 structural dimensions for which
alternative component designs can be selected to form a
user interface software architecture.

Figure 3 - A View of Integration Levels of Object-Oriented Systems

Characterization  of  systems  components  via  this 
sort  of  layered  decomposition  can  help  users  and  builders 
uncover  and  prioritize  required  features  and  performance 
factors in terms that are mutually intelligible.

An  object-oriented  system  architecture  uses 
several  different  kinds  of  objects  to  convey  system 
properties,  to  describe  the  fine  and  coarse  structure  of 
design,  and  to  enable  reasoning  about  various  system 
properties. 

3.5 Integration Layers

Four  kinds  of  objects  are  typically  found  in 
software  systems.  Gaines  [21]  identifies  these  objects  as 
belonging  to  the  four  levels  of  integration  depicted  in 
Figure  3.  The  most  elementary  or  fine-grained  objects 
deal  with  typical  code-level  implementations  such  as  for 
device drivers. At the next higher level, protocol
standards may be reflected in objects that provide generic
processing  interfaces  for  communications.  Presentation  of 
systems  outputs  requires  different  kinds  of  protocols  for 
data  bases  and  graphical  user  interfaces.  Finally, 
functional  applications  can  be  composed  of  objects  whose 
apprehensibility  derives  directly  from  the  problem  domain. 

The  Object  Management  Architecture  [56] 
represents  one  object-oriented  means  to  facilitate  efficient 
integration of software objects. It defines means for inter-object
communication and identifies operations that all
classes  must  support,  and  common  objects  that  are  useful 
in  wide  ranges  of  applications. 

Evolutionary  integration  of  an  object-oriented 
system proceeds from the bottom up, providing an
integrated  base  of  protocol  and  implementation  objects 
that  can  be  used  for  testing  higher  level  constructs.  Lorin 
[39] notes the necessity of a design method that supports
instances  of  bottom  up  effort  where  super  classes  are 
extruded  from  sets  of  smaller  base  objects.  This  is 
effectively  a  way  to  discover  the  structure  of  a  solution 
from  its  basic  components.  At  the  same  time  a  top  down 
integration  provides  system  level  capabilities  for 
prototyping  and  user  evaluation. 
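
The bottom-up ordering can be sketched as follows. The four level names paraphrase the integration levels described in this section; the component names and the `integration_order` helper are hypothetical and serve only to show how lower-level objects would be integrated first so that they can support testing of higher-level constructs.

```python
from enum import IntEnum


class Level(IntEnum):
    IMPLEMENTATION = 1  # code-level objects, e.g. device drivers
    PROTOCOL = 2        # generic communications interfaces
    PRESENTATION = 3    # data base and graphical user interface protocols
    APPLICATION = 4     # objects drawn from the problem domain


def integration_order(components):
    """Sort (name, Level) pairs so lower levels are integrated first."""
    return [name for name, level in sorted(components, key=lambda c: c[1])]


# Hypothetical components, listed in no particular order.
components = [
    ("track display", Level.APPLICATION),
    ("serial driver", Level.IMPLEMENTATION),
    ("chart widget", Level.PRESENTATION),
    ("message codec", Level.PROTOCOL),
]
order = integration_order(components)
```

Reversing the same list gives a top-down view, which corresponds to the prototyping direction: application-level objects are shown to users first, with lower levels stubbed until they are integrated.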

3.6  Potential  Value  Added 

From the foregoing discussion, several
observations can be made about the potential value to be
added  to  reengineering  of  large  scale  systems  by  using  an 
object-oriented  and  architecture-based  approach. 

Architectural  views  of  generic  system  capabilities 
improve  user  understanding  of  systems,  and  help  builders 
plan better technology growth paths. Garlan's experience
with  a  course  in  software  architecture  testifies  that 
architecture 

•  increases  the  shared  understanding  of  high  level 
relationships in systems,

•  facilitates  development  of  new  variations  of 
previous  systems,  and 

•  allows  the  software  engineer  to  make  principled 
choices  among  design  alternatives.  [22] 

Zachman  recognized  the  value  of  multiple  views 
for  improving  information  system  development  [65].  His 
model  suggested  a  perspective-based  architectural 
approach  for  re-engineering  of  information  systems  [36]. 

Generalization  and  specialization  of  systems  are 
easier  when  systems  knowledge  is  categorized  and 
organized  in  an  architectural  fashion. 

Characterization  of  systems  components  via 
layered  decomposition  helps  uncover  and  prioritize 
required  features  and  performance  factors. 

Different  kinds  of  objects  in  a  system  will  have 
different  scope.  This  corresponds  to  the  architectural  layer 
at  which  objects  are  integrated. 

Organizing,  rationalizing,  and  rectifying  system 
models  from  architectural  perspectives  helps  assure 
complete  and  consistent  reflection  of  key  system 
properties  for  an  evolutionary  software  engineering 
process. 

The  above  observations  regarding  the  value 
added  by  architectural  approaches  for  evolving  systems 
can be summarized in the following propositions:

•  Evolutionary  direction  is  guided  by  architectural 
vision  and  facilitated  by  robust  legacy. 

•  Representations  of  structure  promote 
understanding  of  generic  capabilities. 

•  Architectural  views  frame  both  system  specific 
and  system  generic  characteristics. 

•  Layering  techniques  help  architectural  models 
to  surface  properties  of  system  components. 


•  Object-oriented  architecture  aids  in  prototyping 
user  capabilities  from  the  top  down,  and  aids  in  integrating 
systems  platforms  from  the  bottom  up. 

4  Challenges  to  Successful  Evolution 

Evolutionary  reengineering  must  satisfy  user 
needs  within  limits  that  are  posed  by 

(a)  physical  constraints 

(b)  allowable  budgets  for  time  and  money, 

(c)  unknown  problem  domains 

(d)  volatile  and  poorly  understood  requirements 

(e)  current  technology 

Misjudgments  of  any  of  the  above  factors  can 
misdirect  the  course  of  evolutionary  reengineering  such 
that  resulting  operational  capabilities 

(a)  perform  outside  required  tolerances, 

(b) consume more budget (of time or dollars) than
is  available, 

(c)  provide  insufficient  flexibility  for  planned 
future  enhancements, 

(d) fail to include all required features or misestimate
priorities or other parameters of operations,

(e)  constrain  future  adaptability  or  present 
performance by adopting unswappable architectural
elements  or  underwhelming  technologies. 

Evolutionary  reengineering  needs  to  provide  for 
integrating  future  technology,  and  to  expedite  resolution  of 
physical  and  logical  engineering  decisions  by  building  on 
a  base  of  objects  synthesized  from  legacy  systems.  It  also 
needs  to  leverage  a  base  of  architectural  and  problem 
domain  knowledge.  This  must  yield  sufficient  insight  to 
enable  efficient  model-based  interaction  with  users  to 
validate  operational  requirements.  It  must  also  facilitate 
evaluation  of  prototypes  to  derive  system  functions  that 
are  acceptable  to  users. 

In guiding the path of evolution, pre-resolution of
certain kinds of questions is essential. Especially in
Command, Control, and Communications (C3) systems,
the chief reason for evolutionary development is that
not all requirements can be understood or known explicitly
at the outset of a new development or reengineering effort
[10]. Salasin and Waugh [47] observe that in order to
reengineer systems effectively, information is needed
about alternative processes, decompositions and data
structures, both individually and in terms of their
relationships. Both decisions to use a particular process
and dependence upon data and other processes must be
explicitly visible if they are to be used effectively in
evolving systems.

These  needs  are  the  basis  for  the  challenges  that 
are  illustrated  in  Figure  4. 

5  A  Model  for  Systems  Evolution 

Systems evolution differs from systems
development and from systems modification. Small
modifications, over time, can create large system changes
that may not have had a single direction or unifying intent.
Systems development can start from scratch. However,
systems evolution is directed towards long term user
needs, and it operates on the legacy of existing systems.
This  legacy  includes  the  target  system  itself,  and  an 
existing  depth  of  knowledge  about  the  problem  domain. 

To  satisfy  users,  a  systems  evolution  process 
must  assure  that  changes  create  enhanced  operational 
capabilities  that  meet  long-term  needs.  Architecture 
provides guidelines for the direction of evolution, and
enables  development  of  constraints  that  organize  the 
evolutionary  process. 

The consensus fostered by architectural
representations enables early identification of critical
issues regarding the use of existing technology and of both
hardware and software mechanisms and artifacts. This
consensus similarly helps to surface details of the problem
domain that require specialized analysis or that stress
system engineering or performance capacities.

Successful evolution results when concerns for
synthesis  of  mechanisms  are  separated  from  concerns  for 
analysis  of  problems,  and  when  those  concerns  are 
handled  in  parallel  activities.  Legacy  mechanisms  are 
inherited  from  legacy  systems  and  from  general  purpose 
design  activities  that  fill  needs  of  multiple  products. 
Problem  knowledge  acquired  from  domain  analysis  offers 
specific  insights  that  permit  the  tailoring  of  applications 
solutions  that  uniquely  meet  user  needs. 

Figure  5  offers  a  model  for  evolving  software 
systems.  It  shows  the  stages  of  the  evolutionary  process 
from conceptualization of the system through evaluation
of  tentative  implementations.  It  differentiates  two  distinct 
aspects of the evolutionary process and the artifacts that
comprise the evolving system. One aspect reflects
artifacts that are derived from the legacy of prior systems
so as to provide features of new or changed systems. The
other aspect reflects the sequence of analytical activity
through which fresh knowledge of a problem is iteratively
teased out of increasingly detailed problem specifics.

Rules  of  component  composition  and  ways  to 
standardize  component  interfaces  based  upon  common 
units  of  avionics  components  were  investigated  by  Batory 
and  others  [8].  Properly  staged  composition  and  flexible 
harnessing  of  artifacts  from  both  legacy  sources  and 
domain  knowledge  are  the  essence  of  this  model  for 
systems  evolution. 

6  Information  Architecture  Concepts 

System  evolution  using  an  information 
architecture  to  preserve  a  focus  on  user  needs  is 
"converging  evolution".  This  use  of  an  information 
architecture  leads  to  a  refinement  of  system  functions  that 
satisfy  user  needs.  It  also  facilitates  the  identification  of 
generic  interfaces  between  subsystems.  These  permit 
reuse  of  software  components  and  enable  continuing 
refinement of software mechanisms via a process of
successive  replacement. 

The ongoing evolutionary development of
Command, Control and Communications capabilities for
the Ballistic Missile Defense Organization (BMDO) must
consider the reengineering of many existing systems in
order to integrate the overall BMC3 capability. Figure 6
summarizes concepts for an Information Architecture as
they are viewed by contractor and government engineers
involved in this activity. Urban and others have
elaborated the underlying concepts for the BMC3
Information Architecture [61]. Their draft paper identifies
this information architecture as "an information model for
evolving the system."

Information architecture content provides terms
of reference, guidance for defining and supporting the
engineering process, and organized abstractions in model
form that represent structural, process and behavioral
aspects of the system.

An  essential  starting  point  for  developing  an 
information  architecture  is  a  representation  of  system 
operating  context  in  terms  of  user  activities  and  the 
external  events  that  stimulate  the  system.  Hufnagel  and 
Harbison  propose  a  methodology  which  emphasizes  these 
user  views  of  a  system  in  their  seamless,  scenario-driven 
object-oriented  approach  [29,30].  The  key  notion  in  their 
work is that a meta-linguistic object-oriented approach can
be used efficiently to organize system objects. They
propose a domain independent and virtual specification of
systems based upon conceptual analysis of a system's
requirements, specifications, and designs. Artifacts of
such  scenario-based  approaches  are  essential  elements  of 
an information architecture.

Abstractions of attributes of system properties
and component features are essential when allocations are
made and derivations are worked out at each level of
legacy synthesis and at each level of problem analysis.
Abstractions of properties and features help surface the
constraints, assumptions, factors, and alternatives needed
for thorough analysis, specification, implementation and
evaluation. They thereby facilitate evolution that satisfies
long-term user needs.

7  Information  Architecture  Methods 

Methods  for  applying  information  architectural 
concepts  address  management,  strategy,  evolutionary 
process  and  tools,  modeling,  representation  and  analysis, 
and  use  of  information  architectures. 

7.1  Management 

The  Air  Force  Science  Advisory  Board  (SAB) 
sponsored  a  1993  summer  study  on  Information 
Architecture [17]. It recommended that the Air Force
develop an enterprise-wide information architecture.
This was characterized as an enterprise-wide building code
that is layered, open, and driven by commercial off-the-shelf
(COTS) considerations. A focus on common data
element definitions and on applications interface standards
and conventions was also recommended. A process for
managing architecture development was advocated as a
means to apply an information architecture to both
administrative corporate information management (CIM)
applications and to tactical warfare (mission critical)
applications. Four facets of this process are apropos for
implementation of any information architecture. These
facets are:

(a)  establishing  a  continuous  process  for 
evolving  the  "building  code"  to  meet  changing  needs 
including  compatibility  with  external  organizations. 

(b)  involving  users  and  developers  in  assessing 
and  evolving  this  building  code. 

(c) applying accountability and the concept of
"central direction and decentralized execution" to the
architecture development process.

(d)  placing  a  priority  focus  on  developing 

•  enterprise  architecture  and  process, 

•  interoperability, 

•  use  of  COTS  technology, 

•  continuing  utility  assessments  of  standards, 

•  tools  for  defining  /  analyzing  architectures, 

•  tools  for  migrating  legacy  software 


7.2 Strategy

Commercial tool vendors recommend technical
strategies for developing information architectures. Many
current commercial strategies address needs identified by
the Air Force SAB. Three of these were advocated by
Lock [37] at the 1993 CASE World Conference:

(a) aim the architectural building code at mission
functions and not at implementation technology by

• establishing internal / external exchange
protocols

• connecting to existing applications and data

• migrating first data, then functions

• adopting broad based external standards

(b)  allow  the  architecture  to  guide  the  building 
environment  with  layered  designs,  and  equivalent 
treatment  of  hardware  and  software  entities,  by  requiring 

•  generalized  application  tool  kits, 

•  widely  usable  application  services, 

•  easily  accessible  production  services,  and 

• pervasive infrastructure elements.

(c) develop tools for design engineering and
performance evaluation which support reverse
engineering and integrated modeling of mission, network,
and data functions.

7.3 Evolutionary Process and Tools

Lockman and Salasin describe an object-oriented
approach to implementing a four phase evolutionary
reengineering process [38]. With these phases they
"appeal to intuition" in their advocacy for creating and
operating upon an object-oriented depiction of the current
system to realize new or extended features of a similarly
depicted target system. Their work identifies specific
steps for each of these phases, and examines the needs
and opportunities for tool support. The intuition regarding
both a transformation-oriented reengineering process and
the use of an object-oriented representation for an
information architecture, which they offer in their 1989
paper, is validated by three current architecture-based
software engineering environments, all of which support
an evolutionary reengineering process. These are SNAP
[18], ANSA [1], and DISCUS [20].

SNAP  is  targeted  at  rapid  implementation  of 
client-server  applications  subsystems  which  operate  upon 
workstations and networks of distributed processors. It
relies upon a standard architectural template to organize
and  support  the  process  of  evolving  new  objects  within 
generic  class  libraries.  These  class  libraries  address 
graphic  user  interfaces,  communications,  data  base  access, 
external  applications,  internal  storage,  and  knowledge 
based  support  functions.  SNAP  has  been  applied 
successfully  with  high  productivity  to  developments  of 
decision,  analysis  support,  and  command  and  control 
systems  in  such  diverse  applications  as  anti-submarine 
warfare  and  air  traffic  control. 

ANSA  is  a  programming  support  environment 
based  upon  an  architecture  that  has  been  represented  with 
enterprise,  information,  computation,  engineering,  and 
technology  models  of  generic  distributed  systems.  This 
architecture  makes  the  fact  of  distribution  transparent  to 
application  builders  and  users,  and  produces  distributed 
applications  subsystems  which  can  be  managed  and 
evolved  as  a  coordinated  whole,  rather  than  as  separate 
black  boxes  with  specialized,  and  potentially  incompatible 
development  paths. 

DISCUS  is  a  generic  reusable  software 
architecture  which  provides  high  levels  of  reusability 
between  tools  and  data  sources.  Its  aim  is  realization  of 
seamless  interoperability,  particularly  for  the  class  of 
workstation  applications  which  involve  significant  image 
manipulation.  It  represents  one  of  several  MITRE 
Corporation  sponsored  efforts  in  the  area  of  evolvable 
systems  development.  Other  similar  systems  / 
architecture-based  software  engineering  environments 
include  EXCITE  (coordinated  sharing  of  information  for 
intelligence analysis), DOMIS (distributed object
management  system  for  integrating  legacy  data  bases),  and 
systems  for  support  of  collaborative  computing  and  for 
support  of  distributed  simulations  [9]. 

The  evolutionary  reengineering  process  has  an 
intrinsic  focus  on  design  and  on  redesign.  Few  generic 
tools  have  yet  been  developed  to  support  generic  design 
activities. However, these activities are increasingly
understood to correlate well with the use
of architectural information. Studies of the design process
also suggest several information attributes that are needed
in architectural representations to support typical designer
behavior. Adelson and Soloway report that, in general,
designers exhibit six characteristic behaviors [2]. These
behaviors are as follows:

(a)  formulation  of  mental  models 

(b)  simulation  of  mechanism  behavior 

(c)  systematic  expansion  of  level  of  detail 

(d)  representation  of  constraints 


(e)  recollection  of  prior  plans  and  strategies 

(f) annotation of intermediate artifacts

7.4  Modeling,  Representation  and  Analysis 

Computer  Integrated  Manufacturing  (CIM)  Open 
Systems  Architecture  (OSA)  documents  [33]  provide  a 
succinct  explanation  of  concepts  for  modeling  software 
architectures  and  applying  specific  architectural  concepts. 
The  CIMOSA  modeling  concept  shows  how  to  develop 
enterprise  models  in  an  evolutionary  mode,  and  illustrates 
the  impact  of  information  architecture  on  the  evolution  of 
information  intensive  systems.  Requirements  for 
modeling  languages  which  support  integration  of  layered 
models  have  been  identified  by  Gielingh  [24],  whose  work 
also  addressed  evolutionary  reengineering  for  CIM 
applications.  He  argues  that  to  support  layered  modeling, 
information  modeling  languages  must  support  definition 
of  modeling  dimensions  for  specialization,  discrimination, 
and  orthogonalization.  Specialization  expresses  a 
hierarchy  of  concepts,  discrimination  separates  concepts, 
and  orthogonalization  identifies  concepts  which  are 
independent  of  one  another.  Application  of  these 
definitions  helps  realize  abstract  and  layered  architectures 
which  are  truly  evolvable. 

Representation  of  information  architectures 
inherently  involves  both  legacy  and  problem  domain 
information.  Models  can  provide  top-down 
representations  of  architectural  features.  From  a  bottom- 
up  perspective,  component  attributes  also  must  be 
represented.  Tracz  reports  on  the  structure  of  a  design 
record  for  Avionics  Domain  Application  Generation 
Environments  (ADAGE)  [59].  He  details  18  distinct 
elements  of  a  design  record  for  legacy  avionics  software. 
These  include  the  following  dynamically  changing  aspects 
of  any  potential  software  component: 

(a)  name  /  type 

(b)  description 

(c)  requirement  specification  fragment 

(d)  design  structure 

(e)  design  rationale 

(f) interface specifications and dependencies

(g)  program  design  language  text 

(h)  implementation 

(i)  configuration  and  version  data 

(j) test cases

(k)  metric  data 

(l)  access  rights 

(m)  search  points 

(n)  catalog  information 

(o) library and architecture links

(p) hypertext paths

(q)  models 

(r)  constraints 
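
The eighteen elements listed above can be pictured as fields of a single record. The following is a minimal sketch only: the field names, types, and the `missing_elements` helper are assumptions made for illustration and are not taken from the ADAGE design record itself. Elements (a) and (f) are each split into two fields, which gives twenty fields in total.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class DesignRecord:
    """Hypothetical container for the design-record elements (a) through (r)."""
    name: str                                   # (a) name
    component_type: str                         # (a) type
    description: str = ""                       # (b)
    requirement_fragment: str = ""              # (c)
    design_structure: str = ""                  # (d)
    design_rationale: str = ""                  # (e)
    interfaces: List[str] = field(default_factory=list)          # (f) interface specifications
    dependencies: List[str] = field(default_factory=list)        # (f) dependencies
    pdl_text: str = ""                          # (g) program design language text
    implementation: str = ""                    # (h)
    version: str = ""                           # (i) configuration and version data
    test_cases: List[str] = field(default_factory=list)          # (j)
    metrics: Dict[str, float] = field(default_factory=dict)      # (k)
    access_rights: List[str] = field(default_factory=list)       # (l)
    search_points: List[str] = field(default_factory=list)       # (m)
    catalog_info: str = ""                      # (n)
    architecture_links: List[str] = field(default_factory=list)  # (o)
    hypertext_paths: List[str] = field(default_factory=list)     # (p)
    models: List[str] = field(default_factory=list)              # (q)
    constraints: List[str] = field(default_factory=list)         # (r)


def missing_elements(record: DesignRecord) -> List[str]:
    """Return the names of fields that are still empty for this component."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    rec = DesignRecord(name="nav_filter", component_type="package",
                       description="navigation filter recovered from a legacy system")
    print(missing_elements(rec))
```

Because these aspects change as a component evolves, a helper such as `missing_elements` suggests one way a reengineering team might track which parts of a legacy component's record still need to be recovered.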

Analysis of problem domain attributes is essential
for understanding and refining solutions which realize new
capacities, improve non-functional qualities, refine
existing functions, create interfaces to new or changed
external contexts, etc. The nature of problem domains has
been investigated using several techniques, many of which
are characterized as "domain analysis". Wartik and
Prieto-Diaz catalog and compare five differing approaches to
domain analysis [62]. Brief synopses and references are
provided below. Each domain analysis method offers a
slightly different perspective on uncovering user needs and
understanding the legacy upon which evolutionary
reengineering must be based.

Prieto-Diaz' own analysis method is a hybrid of
problem and solution oriented approaches. He supports
bottom-up analysis with a classification approach and
top-down activities with systems analysis, and aims to provide
artifacts which can be reused. These range from problem
features to solution mechanisms.

The  FODA  method  of  domain  analysis  and  the 
Synthesis  method  follow  a  top  down,  problem  oriented 
approach  to  analysis  of  a  domain  [43,14].  Both  of  these 
domain  analysis  methods  focus  on  invariable,  unique,  and 
commonly  used  features  of  elements  of  the  domain.  The 
Synthesis  technique  is  aimed  at  definition  of  a  process 
model  for  application  development,  while  the  FODA 
method  is  focused  on  user  decisions  and  views  which  can 
become  attributes  of  product  architecture. 

The KAPTUR method [6] for domain analysis
and Lubars's method [62] both provide domain models
which can be more readily transformed by domain
engineers into implementations. Both of these methods
apply to the stepwise process of assessing legacy as well
as to the analysis of problem domain attributes.

Another  domain  analysis  approach  was 
developed  for  the  Joint  Integrated  Avionics  Working 
Group,  and  applies  an  analysis  technique  which  is  based 
upon Coad and Yourdon techniques [27,15]. This
technique is also implementation oriented, and produces
detail  which  facilitates  generation  of  reusable  artifacts. 

7.5 Use of Information Architectures

Gelernter's vision ("shadow programs" which
mirror the operation of existing systems so as to facilitate
their evolution [23]) points to the goals of recent research
reported by Shaw [51] in which he investigates formal
methods and mechanisms for executable, universal,
formal, and scalable specifications. In specialized
instances [55], some success has been reported with
formal software development tools for automated
transformational development. However, in the
mainstream,  information  architectures  must  be  developed 
by  careful  analysis  of  legacy  systems  and  problem 
features.  A  process  ("Implementing  Model  Based 
Software  Engineering  (MBSE)")  is  described  by  Withey 
which  covers  this  more  general  range  of  applications  [63]. 
MBSE consists of two parallel processes. One applies to
domain engineering - the process for creating software
models and other core assets; the other applies to
application engineering - the process for using models in
the construction of software systems. When software
architectures  are  sufficiently  mature  in  terms  of  their 
formal  representations  and  in  terms  of  the  generality  of 
their  abstractions,  then  it  will  be  possible  to  automatically 
generate  components  of  applications.  Presently,  the 
MBSE  process  is  largely  manual.  Its  effectiveness  is 
dependent  upon  careful  use  of  modeling  and 
representational  formalisms  and  upon  an  information 
architecture  to  guide  its  convergence  on  long  term  user 
needs. 

An  information  framework,  within  which  design, 
behavioral,  and  engineering  attributes  of  a  system  are 
collected  and  stored,  should  be  structured  to  facilitate  each 
level  of  synthesis  and  analysis  that  is  needed  to  meet 
reengineering  objectives.  It  must  preserve  views  of  legacy 
mechanisms  and  problem  domain  features  which  are 
significant  to  successful  reengineering.  Provision  for 
defining design objects is needed within each of these
views. 

Figure 7 depicts an information architecture as
such  a  framework,  and  separates  the  related  but  distinctly 
different  concepts  of  systems  and  software  architecture. 

In an evolutionary reengineering process,
abstractions of system properties and component relations
provide a basis for reasoning about the system. From this
reasoning, both legacy details and domain knowledge can
be used to create a system architecture. In the subsequent
increments of the process for evolving new capabilities,
allocations of requirements and derivations of
specifications are revisited after evaluation of trial
implementations that prototype significant new parts.

When tradeoff analyses are posed for design
alternatives,  independent  technical  factors  must  be  scored 
to  make  balanced  decisions.  On  the  other  hand,  when 
quality  is  assessed,  correlated  assessment  perspectives  lead 
to  quality  generalizations.  The  domain  insight  and  legacy 
detail  within  an  information  architecture  lead  to  this 
general  balance. 
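
One simple way to score independent technical factors is a weighted sum over design alternatives. The sketch below is illustrative only: the paper does not prescribe a scoring method, and the factor weights, the two alternatives, and their scores are hypothetical values chosen to show the mechanics.

```python
from typing import Dict, List

# Hypothetical weights for independent technical factors (higher = more important).
WEIGHTS: Dict[str, int] = {"flexibility": 3, "efficiency": 2, "modifiability": 1}

# Hypothetical design alternatives, each scored 1 (poor) to 5 (good) per factor.
ALTERNATIVES: Dict[str, Dict[str, int]] = {
    "layered": {"flexibility": 4, "efficiency": 2, "modifiability": 5},
    "monolithic": {"flexibility": 1, "efficiency": 5, "modifiability": 2},
}


def weighted_score(scores: Dict[str, int], weights: Dict[str, int]) -> int:
    """Sum weight * score over every weighted factor.

    Raises KeyError if an alternative lacks a score for a weighted factor,
    so that an unscored factor cannot silently bias the comparison.
    """
    return sum(weights[factor] * scores[factor] for factor in weights)


def rank_alternatives(alternatives: Dict[str, Dict[str, int]],
                      weights: Dict[str, int]) -> List[str]:
    """Return alternative names ordered from highest to lowest weighted score."""
    return sorted(alternatives,
                  key=lambda name: weighted_score(alternatives[name], weights),
                  reverse=True)


if __name__ == "__main__":
    for name in rank_alternatives(ALTERNATIVES, WEIGHTS):
        print(name, weighted_score(ALTERNATIVES[name], WEIGHTS))
```

A weighted sum is meaningful only when the factors are reasonably independent, which is why the text stresses independent factors for tradeoff decisions; correlated factors would be counted more than once.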

When boundary conditions and limits are known,
evaluators can be assured of covering all cases of interest
completely. Similarly, the precedence of required features
and the priority for behavioral options are derived from
system context detail. These in turn derive from user
consensus on domain features, which is facilitated by
elaboration of an information architecture.

In the object-oriented community, a kind of
meta-information architecture exists in the Object Management
Architecture (OMA) [57]. The OMA Guide provides a
general and abstract framework for object-oriented
systems that outlines a single terminology, technical goals
(engineering process) and architectural goals (product
feature), and provides a reference model for integrating
distributed applications using object-oriented techniques.

8  Impact  of  An  Information  Architecture 

Impact  of  an  information  architecture  is  seen  in 
the  realization  of  systems  performance  and  capability  by 
means  of  an  efficient  and  cost-effective  engineering 
process,  which  provides  required  levels  of  functional  and 
non-functional  quality. 

An  information  architecture  adds  value  to  the 
evolutionary reengineering process by:

•  Illuminating  with  abstraction,  e.g.:  functions, 
entity  relationships,  objects,  object  classes,  processing 
states,  data  events,  operational  modes; 

•  Decorrelating  relationships  to  enable  balanced 
tradeoffs,  e.g.:  operational  flexibility  versus  efficiency,  or 
modifiability  versus  seamless  integration; 

•  Correlating  perspectives  to  enable  qualitative 
assessment,  e.g.:  processing  accessibility  with 
availability,  or  data  persistence  with  redundancy; 

•  Framing  boundaries  to  enable  thorough 
functional  evaluation,  e.g.:  event  scenarios,  process 
threads, file schema;

• Providing context for setting precedence and
prioritization, e.g.: operator scenarios, metadata,
theoretical models.

Information architecture has a positive impact on
acceptability, optimization, efficiency, and return on
investment (ROI) of evolutionary reengineering efforts. It
adds value to several aspects of an evolutionary system.
These are its:

•  capability 

•  performance 

•  engineering  process 

•  overall  quality 

The  object-oriented  aspects  of  an  Information 
Architecture  offer  the  opportunity  to  assess  the  quality  of  a 
proposed  reengineering  effort  according  to  the  techniques 
proposed  by  Chidamber  and  Kemerer  [13].  While  these 
techniques  are  aimed  at  the  evaluation  of  object-oriented 
design,  they  clearly  apply  to  the  envelope  of  object 
oriented  design  alternatives  posed  within  an  Information 
Architecture. Implications from the use of these metrics
can include

• indications of design tradeoff opportunities
(e.g.: between inheritance and related reusability on the
one hand and simplicity and ease of understanding on the
other);

• identifications of opportunity for optimizing the
allocation of testing resources (e.g.: by identifying classes
of error-prone interfaces); and

• assessments of the viability of the revised class
structure proposed for the system (e.g.: in terms of
numbers of object classes at the root level, and
interconnections between various parts of an application).

To forecast the impact of an evolutionary
reengineering effort, qualitative and quantitative metrics
are necessary. Kazman, Bass and others have
experimentally applied methods for evaluation of
architectures [32]. Their approach applies a life cycle
perspective (which considers engineering process aspects
of an information architecture). It also relies upon a
common representational form to surface a common
understanding (which permits comparison of product
attributes contained in an information architecture).
Although present work has been limited to user interface
architectures, it permits comparison and ranking of
software architectures, and the approach promises to
extend to other software architectures.

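As a rough illustration of the kind of object-oriented design metrics proposed by Chidamber and Kemerer [13], the sketch below computes two of them, depth of inheritance tree and number of children, together with the count of root-level classes mentioned in this section. The class hierarchy and its names are hypothetical, and single inheritance is assumed for simplicity.

```python
from typing import Dict, List, Optional

# Hypothetical class hierarchy: class name -> parent name (None marks a root class).
HIERARCHY: Dict[str, Optional[str]] = {
    "Track": None,
    "AirTrack": "Track",
    "MissileTrack": "AirTrack",
    "SurfaceTrack": "Track",
    "Sensor": None,
}


def depth_of_inheritance(cls: str, hierarchy: Dict[str, Optional[str]]) -> int:
    """Count the inheritance steps from cls up to its root class."""
    depth = 0
    parent = hierarchy[cls]
    while parent is not None:
        depth += 1
        parent = hierarchy[parent]
    return depth


def number_of_children(cls: str, hierarchy: Dict[str, Optional[str]]) -> int:
    """Count the immediate subclasses of cls."""
    return sum(1 for parent in hierarchy.values() if parent == cls)


def root_classes(hierarchy: Dict[str, Optional[str]]) -> List[str]:
    """List the classes that have no parent, in alphabetical order."""
    return sorted(name for name, parent in hierarchy.items() if parent is None)


if __name__ == "__main__":
    for name in HIERARCHY:
        print(name, depth_of_inheritance(name, HIERARCHY),
              number_of_children(name, HIERARCHY))
    print("roots:", root_classes(HIERARCHY))
```

Deep inheritance chains point to reuse bought at the cost of understandability, and a large number of root classes or children can flag where a proposed class structure may need review before reengineering proceeds.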

In their analysis of non-functional factors in the
quality of large systems, Salasin and Waugh construct a
chain of interlocking commitments and obligations to map
system level quality factors into indicators of
non-functional quality [46,48]. Their technique permits
assessment of information architectural attributes so as to
forecast such system qualities as testability and
survivability. It enables an information architecture to
become a source of primary evidence for on-going
examination of non-functional quality characteristics of an
evolving system. The approach uses several categories of
reengineering scenario to stimulate analysis. The
categories are platform changes (e.g.: processors, displays,
software subsystems); performance improvements (e.g.:
response, latency, speed, capacity, throughput, accuracy,
precision); extensions of functionality (e.g.: updated
constraints and operating parameters); changes to
capabilities (e.g.: new missions); quality improvements
(e.g.: modifiability); and new external interfaces (e.g.:
swapped out functionality).

In general, two kinds of factors indicate the
quality of an information architecture: its intrinsic
leverage is a function of content and context factors; its
extrinsic leverage flows from its adequacy, usefulness and
usability. Figure 8 relates several facets of an information
architecture to four aspects of its quality.

Information  architecture  qualifies  both  behavior 
and  design  characteristics,  and  identifies  necessary 
characteristics  of  the  engineering  process  environment. 
As  a  framework  for  evolutionary  reengineering,  the  value 
of  an  information  architecture  is  indicated  in  its 

(a)  high-level,  abstract  models  of  the  target 
system  from  all  perspectives  that  add  to  developer  and 
user  insight  (at  least  dynamic,  functional,  and  structural). 

(b)  boundaries  and  definitions  for  both  operating 
and  engineering  environments. 


9 Conclusion

Likelihood of satisfactory operations of large
scale systems increases with evolution by virtue of
increased iteration to refine implementation aspects that
meet long-term user needs.

Objects whose scope ranges from infrastructural
mechanisms to problem domain artifacts guide builders
and users toward efficient realization of systems
components. Architectures provide structure and form for
mediating this organization and understanding so as to
meet constraints of builders and needs of users.

User-functional capabilities that aptly typify an
evolving system are uncovered by comprehensive
top-down analysis of its problem domain. This analysis also
helps decide which models are the best platforms for
evaluating prototypes of these capabilities.

Robust objects whose value has been proven in
legacy systems are the result of continuing, bottom-up
synthesis of previously successful products and systems.
Layered organizations of proven objects enable rapid
realization of platforms for evolving and evaluating
prototypes of subsystem elements that have potential
operational value.

Several conclusions about evolutionary
reengineering of large scale software systems follow:

• Converging evolution assures operational
success by realizing user goals incrementally and reducing
risk of cost or schedule overruns and technical shortfalls.

• Layered object organization facilitates evolution
by enabling efficient problem-oriented exploration and
product-domain tailoring.

• Information architecture leads to converging
evolution. Systems and software architectures frame the
target for evolution, and information architecture frames
the engineering activity.

• Analysis of problem domains is essential to
uncover typical functions for evolutionary prototypes, and
best models for evolving and evaluating prototype
capabilities.

• Synthesis from legacy systems speeds definition
of robust implementation objects from products that have
had proven success, and realistic platforms for evaluating
prototypes / initial operations.

10 Summary Observation


An  analogy  highlights  the  above  conclusions. 
Evolutionary  reengineering  using  an  information 
architecture  can  be  viewed  as  a  problem  of  stabilizing  the 
dynamic  behavior  of  a  software  system.  Linear  systems 
models  are  used  to  approximate  control  solutions  for  such 
problems.  Solutions  are  obtained  via  an  iterative  process 
of  integration  in  which  vector  representations  and  matrix 
organization  of  the  known  information  about  the  problems 
enable  generation  of  parameters  of  the  new  stabilized 
system  state.  In  evolutionary  reengineering  using  an 
information  architecture,  the  iterative  process  of 
integration  can  be  viewed  as  a  series  of  incremental 
changes  to  functionality  and  infrastructure  through  which 
new  states  of  behavior  and  capability  are  evolved  for  the 
software system.

In this analogy, the system behavior of a software
system and the nature of its evolution relate to the
behavior and nature of a dynamic system. Information
representing  the  resulting  state  of  software  capability  is 
analogous  to  the  state  vector  that  represents  the  stabilized 
system  state  in  the  dynamics  model.  The  driving  force  of 
user  needs  is  comparable  to  the  random  noise  vector  that 
perturbs  a  dynamic  system  to  cause  a  change  in  its  state. 
Controls  provided  via  an  architecture-based  evolutionary 
process  derive  from  problem  domain  analysis  and 
synthesis  of  artifacts  from  legacy  systems.  These  are 
analogous  to  the  control  vector  which  constrains  the 
stabilization  of  the  linear  dynamics  system. 

Stabilization  of  a  dynamics  model  is  dependent 
upon  integrating  the  information  content  of  these 
fundamental  vector  representations  with  a  matrix  of 
information  that  represents  the  changing  system  state. 
This  matrix  is  analogous  to  an  information  architecture. 
Just  as  determining  the  precise  form  and  content  of  a 
system  state  matrix  is  essential  for  developing  a  solution  to 
stabilize  a  mechanical  linear  dynamic  system,  elaborating 
the  content  and  understanding  the  necessary  form  of  a 
matrix of architectural information are
essential to transforming a large-scale software system via
an evolutionary process into a system with significantly
enhanced, extended, or adapted capabilities. Controlling
the convergence (stabilizing stages) of evolutionary
reengineering  requires  that  both  the  guiding  architectural 
vision  and  the  evolutionary  process  be  represented  in  the 
same  context  and  form. 
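
One way to make the analogy concrete is the familiar discrete-time linear state equation. The symbols below are illustrative assumptions and do not appear in the paper: $x_k$ stands for the state of software capability after increment $k$, $A$ for the system state matrix (the role played by the information architecture), $u_k$ for the control vector (inputs from problem domain analysis and synthesis from legacy systems), $B$ for a hypothetical matrix that maps those controls onto the state, and $w_k$ for the noise vector (the driving force of user needs).

```latex
x_{k+1} = A\,x_k + B\,u_k + w_k
```

Read this way, each evolutionary increment corresponds to one step of the iteration, and convergence of the evolution corresponds to the sequence $x_k$ settling toward a stabilized state.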

An  information  architecture-based  approach  to 
reengineering  provides  a  pattern  for  representing  the 
evolutionary  process  itself,  the  control  inputs  provided  by 
domain  analysis  and  synthesis  of  legacy,  and  the  stimuli 
provided  by  new  user  requirements. 

In  different  terms,  this  analogy  expresses  the 
insight  presented  by  Srinivas  and  Smith  in  their  recent 
short  paper  on  property  preserving  transformation  of 
programs [S8]. When combined with Gelernter's notion of
"shadow programs", a path is clearly evident for realizing
system features which support self-sustained evolution.

Abstract

The financial pressure to meet the need for change in complex
systems through evolution rather than through revolution has
spawned the discipline of reengineering. One driving factor of
reengineering is that it is increasingly becoming the case that
enhanced requirements placed on complex systems are overstressing
the processing resources of the systems. Thus, the distribution of
processing load over highly parallel and distributed hardware
architectures is being explored as part of the reengineering
process. Existing complex systems were developed originally to
exploit small scale concurrency in programming paradigms that
support little or no expression of concurrent execution.
Therefore, several difficult tasks must be accomplished to reverse
engineer, transform and restructure systems so that they exploit
significantly increased amounts of concurrency. In this paper we
present metrics for capturing features of complex systems needed
for a transformation approach for enhancing concurrency. The
metrics not only capture systems’ features necessary for
concurrency analysis, but also are independent of any programming
language, operating system or hardware architecture. Using the
metrics, we define an approach for transforming complex systems
that exploit modest amounts of concurrency into systems that
utilize large scale concurrency. The approach includes the
aggregation of concurrency information across levels of
abstraction hierarchies to enable concurrency analysis at varying
degrees of granularity, ranging from the statement level, to the
package instance level, to the level of federated systems. In
conjunction with concurrency enhancement for the accommodation of
enhanced requirements, our reengineering approach also employs
software component layering and reuse to reduce the costs of
design, implementation, testing, verification, and maintenance.

1  Introduction 

A complex system has many characteristics, including
performance, timeliness, availability, dependability, safety and
security. Furthermore, such a system typically performs many
related functions concurrently, interacts with the environment and
many human operators and/or clients simultaneously, consists of
many interconnected processing elements, contains many millions or
tens of millions of lines of code, takes years to develop from
first concept formulation to final deployment, and has development
costs of many tens or hundreds of millions of dollars.

Complex systems generally address nontransient requirements
that simply cannot be addressed with simpler solutions. Thus they
tend to be characterized by long life cycles, often spanning
decades. During such extended life cycles, change is inevitable in
many dimensions: operational environment, system requirements,
technology base, etc. Because of the time and cost of development
of complex systems, and because of the infrastructure needed for
their development and continued support once deployed,
infrastructure which includes highly trained personnel, hardware
and support tools, documentation, test procedures, and many other
components, there is enormous financial pressure to meet the need
for change through evolution rather than revolution.

This need has spawned the discipline of reengineering, the
systematic application of methodology and tools to managing the
evolutionary transformation of existing complex systems to
encompass new or altered requirements and to transport such
systems into new environments and onto new technology bases.

While reengineering holds the promise of increased efficiency
in dealing with, and managing change in, large, complex systems,
it is not, and probably can never be, a panacea. The act of
changing (rather than rebuilding) a complex system inevitably
introduces something akin to entropy into a well ordered system.
The accumulation of disorder can be minimized to some degree by
careful choice of original design and by advancing and fully
exploiting reengineering technology. In particular, the current
trend toward so called open system designs is a de facto
recognition of the inevitability and cost of change.

It should be added that because of the multicomponent nature
of complex systems, it is often the case that old components must
coexist harmoniously with new components in a mature complex
system. Thus, forward engineering of complex system components may
occur simultaneously with reengineering of other components.
Reengineering, along with open system design, may properly be
viewed as available techniques for optimizing the resources
required to extend the life cycles of complex systems.

It is increasingly becoming the case that the increased
requirements placed on complex systems are overstressing the
processing resources of the systems. Thus, implementations that
exploit highly parallel and distributed hardware are being
developed. While complex systems of the previous generation
employed some parallel processing, they typically used on the
order of twenty-five processors. Functionality enhancements in
modern complex systems may need more than one thousand processors.
Since complex systems were developed for small scale parallelism
in programming paradigms that supported little or no expression of
parallelism, the exploitation of the tremendously increased
parallelism is a challenge that must be addressed.

In conjunction with concurrency enhancement for the
accommodation of enhanced requirements, reengineering should also
consider the use of modern software engineering principles to
reduce the costs of design, implementation, testing, verification,
and maintenance. Layering of software components is one technique
that addresses these concerns. When a system is constructed by
layering, the benefits include encapsulation and information
hiding (i.e., loose coupling of software components) [6],
abstraction (highly cohesive modules), ease of understandability
and simplification of analyses for concurrency [12, 14, 9], timing
properties [12, 10], dependability [2] and security. The reuse of
software components during system implementation and
reimplementation is another technique that addresses the
aforementioned concerns. When previously engineered and validated
components are reused, the elapsed time from the initial phases of
reengineering until system deployment can be reduced
significantly. Fortunately, modern programming languages provide
constructs which enable software to be implemented by layering
components and by reuse. For example, Ada provides the generic
package, which can be used to implement abstract data type modules
which are parameterized by types and operations. Similarly, C++
allows the definition of generic class templates, which can be
instantiated with type and operation parameters to obtain abstract
data objects. The effective use of such language constructs should
be considered during reengineering.

We have developed an automated reengineering framework that
considers concurrency as well as layering and reuse of software
(see Figure 1). To reengineer a system so that the resulting
system has enhanced concurrency and is implemented using modern
software engineering principles, it is necessary first to reverse
engineer the existing system. That is, it is necessary to capture
in an intermediate representation (IR) the syntactic and semantic
attributes and interrelationships of the current system’s
software, hardware, and humanware (e.g., radar operator or other
tactical operators, manual entry of key to perform secure
operations). Using the IR, concurrency metrics are extracted.
After capturing the syntactic and semantic aspects, as well as the
concurrency metrics of the system to be reengineered, redesign
(transformation) and reimplementation (translation) of the system
are performed. Redesign (or design transformation) performs a
restructuring of the system by formulating abstractions and
exposing concurrency, while considering serializability and
deadlock. The new system design is translated into programs in one
or more target languages. The generation of code from a design
involves the definition of packages/templates/classes to implement
the design abstractions (employing genericity, reuse,
encapsulation, and information hiding), the definition of data
structures (using types exported by, and/or objects encapsulated
within, package/template/class instances), and the retrieval and
reuse of previously implemented components that are stored in a
software repository. Given the complete collection of software
components, they are clustered/partitioned and assigned to a
parallel and/or distributed hardware platform in a manner that
allows effective utilization of concurrency and that also allows
compliance with timing, dependability, security and other
constraints.

Figure 1: Reengineering framework for concurrency enhancement,
software component layering and reuse.

The work described in this manuscript is inspired by both the
AEGIS and the HIPER-D projects, which provided the impetus to
consider the problems addressed and which are among the potential
recipients of the research results. AEGIS [4, 5] is a complex
system engineered to protect a fleet of ships from subsonic and
supersonic threats such as manned aircraft, air-to-surface
missiles, surface-to-surface missiles, and undersea missiles. The
requirements of the system include fast reaction (instant response
to targets in specific sectors that match particular patterns with
respect to speed, course and altitude), accuracy, resilience to
faults and to overloads, and security. For example, AEGIS may
detect a sea-skimming missile at a distance of 15 miles and
traveling at Mach 1. To avoid loss of lives and equipment due to
impact of the missile, the detection, classification, tracking,
assignment of threat priority, and firing of a counter-missile
must occur within a very stringent deadline [3]. The other
motivator of this work, the HIPER-D project, is an ARPA sponsored
project that is exploring the use of high-performance distributed
(and parallel) computer systems as platforms for complex systems.
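
To give a feel for how stringent the deadline in the sea-skimming missile example is, the following sketch estimates the time from detection to impact. The unit conversion and the speed of sound are assumptions not stated in the paper: the distance is taken to be in statute miles, and Mach 1 is taken as roughly 343 m/s (air at about 20 degrees C). The function name is hypothetical.

```python
# Rough end-to-end time budget for the missile example in the text.
# Assumptions (not from the paper): statute miles, and Mach 1 taken
# as about 343 m/s, the speed of sound in air near 20 degrees C.
METERS_PER_MILE = 1609.344
SPEED_OF_SOUND_M_S = 343.0


def time_to_impact_s(distance_miles: float, mach: float) -> float:
    """Seconds until impact for a target closing at constant speed."""
    distance_m = distance_miles * METERS_PER_MILE
    speed_m_s = mach * SPEED_OF_SOUND_M_S
    return distance_m / speed_m_s


if __name__ == "__main__":
    budget = time_to_impact_s(15, 1.0)
    print(f"about {budget:.0f} s from detection to impact")
```

Under these assumptions the whole chain of detection, classification, tracking, prioritization and firing must fit within a budget on the order of a minute, and the counter-missile's own flight time consumes part of it.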

The remainder of the paper is organized as follows. Section 2
illustrates typical software, hardware and operating system
characteristics of Navy systems for which reengineering is being
considered presently. In Section 3, it is shown how contemporary
programming languages, operating systems and hardware platforms
can be used as targets for a reengineering process that enhances
concurrency, produces layered software components and applies
software reuse. In Section 4 we discuss an intermediate
representation and metrics that capture the relevant attributes of
complex systems in a manner that is independent of any programming
language, hardware architecture, or operating system. An overview
of our methodology for using our IR and metrics to transform
complex systems that have minimal software layering and
small-scale parallelism into systems with extensive layering of
software components and large-scale parallelism is also presented
in Section 4. The methodology is based in part on a model of
concurrent execution that we have defined called asynchronous
remote procedure call (ARPC) [15], which allows concurrency in
amounts proportional to the amount of layering in application
software. Since parallelism is not the only concern in complex
systems, our methodology also considers timeliness, dependability
and security.

2  Typical  Characteristics  of  Systems 
to  be  Reengineered 

Complex Navy systems built more than one decade ago exhibit
certain trends, reflecting the previous state-of-the-art in the
areas of software, hardware, and operating system technology. In
this Section we examine the past trends in each of these areas,
with an emphasis on characterizing aspects of such systems in
paradigm-independent fashions.

A typical software paradigm observed during reverse
engineering is one in which procedures are combined to form a
module (see Figure 2). A module may contain exported operations
(callable from outside the module) and internal operations
(callable only from within the module). In addition to containing
operations, each module may contain data accessible only by its
operations. Each exported operation of a module is termed a module
entry, and serves as a means of manipulating the module’s internal
state. A module may contain at most one of each of several kinds
of module entries, of which the following are representative. An
initialization entry is scheduled when the tactical system is
being initialized or when a disabled module is enabled. A message
entry is scheduled to accept and process messages from other
modules. An error entrance is scheduled when an error condition is
detected. A successor entrance is used for any general purpose,
nonperiodic task. Both a buffer complete entrance and a channel
complete entrance are scheduled in response to user-controlled
directions associated with a completed I/O process. A periodic
entrance executes functions that must be performed at regular
intervals. An important consideration during reengineering is that
assembly language appears frequently in the code bodies of
entries, since module development languages (such as CMS-2)
provide no easy way to control hardware device accesses. In
addition to the user-defined modules, a system may contain global
data (tables) that are accessible by the operations of any module.
There is also a set of common service routines (CSRs) and
executive service requests (ESRs), callable from any operation.

Figure 2: A typical model of previous generation software.

In addition to functionality, modules exhibit other
characteristics. Concurrency and timing properties are stated by
defining periodic module entries. Each of these executes once per
period and may have a deadline by which any particular execution
of the entry must complete. At most one entry may be active within
a module at any time (i.e., modules are monitors). A complex
system is composed of many independent activities (or threads of
control), which are implemented via calls to module entries, ESRs
and CSRs, and which may directly access global tables. Due to the
lack of layering, all modules, tables, CSRs and ESRs are visible
to each activity. Complex systems must adapt to times of overload
by shedding less critical tasks in favor of more critical ones.
This shedding is termed throttling, and is specified by
associating a criticality with each module entry. Throttling
allows some degree of adaptability, but the system must also be
able to function correctly in the presence of hardware faults.
This is normally stated implicitly by designing redundancy into
the hardware.
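
Throttling as described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `EntryRequest` record, the convention that a larger number means higher criticality, the per-entry cost estimate, and the greedy admission policy. The paper states only that a criticality is associated with each module entry.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EntryRequest:
    """A pending module entry (hypothetical record for illustration)."""
    name: str
    criticality: int  # assumed convention: larger means more critical
    cost_ms: float    # assumed estimate of execution time


def throttle(requests: List[EntryRequest],
             capacity_ms: float) -> Tuple[List[EntryRequest], List[EntryRequest]]:
    """Admit entries in order of criticality while they fit; shed the rest."""
    admitted: List[EntryRequest] = []
    shed: List[EntryRequest] = []
    used = 0.0
    for req in sorted(requests, key=lambda r: r.criticality, reverse=True):
        if used + req.cost_ms <= capacity_ms:
            admitted.append(req)
            used += req.cost_ms
        else:
            shed.append(req)  # overload: less critical work is dropped
    return admitted, shed
```

With a capacity of 100 ms and four requests costing 50, 40, 30 and 20 ms in decreasing order of criticality, the first two are admitted and the last two are shed.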

In addition to concurrency, it is also necessary to obey
various security levels. A module entry or a piece of data may
have access restrictions, only permitting users having the
appropriate security level, password, or key to use them. Security
requirements cannot be stated in the programming paradigm, but are
implicitly coded into systems.

Figure 3: A typical model of previous generation hardware.

Fault tolerance is provided in the previous generation of
complex systems by replicating CPUs, memories, and interprocessor
communication links. A typical hardware system employs fewer than
10 nodes, each consisting of a few CPUs with private memories, and
a shared node memory (see Figure 3). The execution paradigm is
usually MIMD. An approach frequently used to tolerate faults is to
maintain dual memories, one containing the current values of all
data and the other keeping current values of critical data. In the
event of a failure of the first memory, the system automatically
switches to the second memory. Both memories also contain complete
copies of the code assigned to the node. Normally, both CPUs
execute the instructions. Each instruction is statically tagged
with the CPU which is to execute it, and the instruction fetching
mechanism ensures that each instruction is executed by the correct
CPU. To perform a “hot restart” when a node becomes overloaded,
execution switches to the copy of code in the backup memory. In
that way, all noncritical data are reset to their initial values.
Additionally, a backup node is installed. If one node crashes, its
software processes are restarted on the backup node. For security,
there are hardware mechanisms (such as keys) that must be switched
on in order to execute secure functions.
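
The dual-memory scheme and the effect of a hot restart can be sketched as follows. This is a simplified software model under stated assumptions, not a description of the actual hardware: the `Node` class, its method names, and the use of dictionaries to stand in for the two memories are all hypothetical.

```python
import copy
from typing import Any, Dict, Iterable


class Node:
    """Toy model of a node with a primary and a backup memory."""

    def __init__(self, initial: Dict[str, Any], critical_keys: Iterable[str]) -> None:
        self._initial = copy.deepcopy(initial)
        self._critical = set(critical_keys)
        # Primary memory: current values of all data.
        self.primary = copy.deepcopy(initial)
        # Backup memory: current values of critical data only.
        self.backup = {k: copy.deepcopy(initial[k]) for k in self._critical}

    def write(self, key: str, value: Any) -> None:
        """Update primary memory; mirror critical data into the backup."""
        self.primary[key] = value
        if key in self._critical:
            self.backup[key] = value

    def hot_restart(self) -> None:
        """Restart from the backup: critical data survive, while
        noncritical data are reset to their initial values."""
        restored = copy.deepcopy(self._initial)
        restored.update(copy.deepcopy(self.backup))
        self.primary = restored
```

For example, if `track_count` is critical and `display_mode` is not, then after a hot restart `track_count` keeps its latest value while `display_mode` returns to its initial setting.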

The execution model used in the previous generation of systems
typically relies on an executive that provides features to handle
such things as hardware interrupts, memory management, application
module scheduling, and fault tolerance. The application programs
use services provided by the executive for module
(re)initialization, intermodule and intercomputer communication,
scheduling of other modules, error processing, input/output
channel communications, and periodic scheduling. These systems are
almost always non-portable and in most cases are coded in
high-level languages and assembly languages available only on a
specific class of hardware. The Aegis Tactical Executive System,
ATES/43, is such an executive, as described in the following quote
from [1]:

The ATES/43 is an interrupt-driven and table-directed
executive system designed to meet a broad range of requirements
specified for AEGIS DDG combat system computer programs that run
in AN/UYK-43 computers. It responds to all AN/UYK-43 interrupts on
a real-time basis, determines the processing to be performed, and
performs the processing or passes control of a CPU either to
firmware module Fault Tolerant System Reconfiguration Module or a
scheduled software module for processing. ATES/43 supports
user-controlled rapid recovery and online reconfiguration, and
provides flexibility for the design of diverse computer programs.

The executive restricts the execution to one entrance per
module at any instant. In other words, only one entrance of a
module may be either executing or in the execution queue at any
time. A secondary queue is provided to handle this restriction.
The user can also specify the CPU on which a module entrance is to
execute.
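
The one-entrance-per-module restriction and the role of the secondary queue can be sketched as below. This is a toy model built on assumptions, not the ATES/43 design: the `Executive` class, its method names, and the policy of promoting the oldest waiting entrance for a module when that module becomes free are all hypothetical.

```python
from collections import deque
from typing import Deque, Optional, Set, Tuple

Request = Tuple[str, str]  # (module name, entrance name)


class Executive:
    """Toy scheduler: at most one entrance per module executing or queued."""

    def __init__(self) -> None:
        self.execution_queue: Deque[Request] = deque()
        self.secondary_queue: Deque[Request] = deque()
        self._busy: Set[str] = set()  # modules executing or in the execution queue

    def schedule(self, module: str, entrance: str) -> None:
        if module in self._busy:
            # The module already has an entrance pending: hold this one back.
            self.secondary_queue.append((module, entrance))
        else:
            self._busy.add(module)
            self.execution_queue.append((module, entrance))

    def run_next(self) -> Optional[Request]:
        """Run one entrance to completion, then promote a waiting one."""
        if not self.execution_queue:
            return None
        module, entrance = self.execution_queue.popleft()
        # ... the body of the entrance would execute here ...
        self._busy.discard(module)
        idx = next((i for i, (m, _) in enumerate(self.secondary_queue)
                    if m == module), None)
        if idx is not None:
            waiting = self.secondary_queue[idx]
            del self.secondary_queue[idx]
            self.schedule(*waiting)
        return (module, entrance)
```

Scheduling two entrances of the same module places the first in the execution queue and the second in the secondary queue; the second becomes eligible only after the first has completed.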

There are several reasons for migrating from the
aforementioned paradigm. As the functionality of complex systems
increases, it is desirable to increase the concurrency in order to
meet the timing requirements. Thus, systems should be portable to
different hardware platforms. The frequent use of assembly
language to implement entries significantly decreases portability.
Furthermore, the lack of user-defined concurrency in system
designs makes it difficult to exploit a large parallel processor,
and the use of programming constructs like pointers makes the
automatic analysis of parallelism troublesome. Also, the flat
module hierarchy structure permits the access of global data
structures by every procedure, leading to inefficiencies due to
synchronization of accesses to such structures. Another deficiency
is the lack of usage of modern software engineering concepts like
abstract data types and objects, and generics (as can be
implemented by Ada packages and C++ classes). The use of such
constructs increases layering, improves reusability, and
simplifies development, reengineering, timing analysis, and
parallelism extraction [6, 15, 8]. The hardware model makes it
impossible to achieve the large scale parallelism necessitated by
the massive capabilities of modern software systems. Additionally,
the ATES-like executive systems are restrictive in terms of
portability and also in terms of services provided to the
application program. Moreover, there is no clear distinction
between services available for interfacing with the hardware and
those provided for the application program. In other words, one
major portability issue is that the operating system functions are
not disjoint from the application run-time type functions. Another
issue is the limitation of the memory management functions: the
executives often do not provide virtual memory. There is also room
for improvement in real-time scheduling techniques. Furthermore,
the degree of parallelism managed by the systems is very low, and
modern parallelism paradigms such as asynchronous remote procedure
call [15] are not supported. Another void in the systems is in the
run-time support for modern software engineering constructs such
as abstract data types and abstract data objects.

Figure 4: An abstract data type (ADT) module.


3  Desired Characteristics of Reengineered Systems

Many of the shortcomings of “yesterday’s” system development
paradigms have been overcome by modern paradigms. Layering, loose
coupling, reuse [6, 7], high cohesion, encapsulation, and
information hiding are facilitated by the proper use of
programming constructs such as abstract data types and abstract
data objects. The ability to define generic ADTs and ADOs enables
the development of parameterized abstractions, resulting in
increased reuse and in a high payoff when such a component is
produced by reengineering (since the cost of reengineering a
component can be amortized over multiple uses of the component).
The specification of concurrency can be performed in modern
languages such as Ada by defining tasks within ADT or ADO modules.
The ability to lexically nest modules reduces the visibility of
data structures to only those needing to access them, thus
lowering module coupling, simplifying analysis of parallelism, and
leading to systems with fewer bugs. The amount of direct
(assembly) code is minimized in modern systems, leading to
increased portability.

Figure 5: A facility module instance.

Figure 9: An abstract data object (ADO) module.

Many modern software engineering paradigms (supported in Ada,
Clu, Modula-2 and RESOLVE) provide techniques supporting
implementation of templates. Normally, each template encapsulates
an abstract data type (ADT). A typical ADT (as illustrated in
Figure 4) exports (1) a type that can be used to declare variables
and (2) operations to manipulate variables of the provided type.
To use an ADT module, it is instantiated by giving it a name (see
Figure 5). In Ada, for example, instantiation is performed with
the use and with clauses. Instantiation creates a module instance,
or a facility. With ADTs, instantiation does not create a data
value. Instead, a variable must be explicitly declared to be of a
type exported by a facility, and may be accessed by calling
operations exported by the module. A set of variables managed by a
module instance is shown in Figure 6. ADT instances may
encapsulate state that is accessible only by the operations of the
module, as shown in Figure 7. Tailorable templates can be
developed in languages permitting development of generic modules
(modules parameterized by types, by operations, or by other
modules). A generic module is represented graphically in Figure 8.
A generic module is instantiated by fixing its parameters.
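
The distinction between a generic ADT module, a facility, and variables of the exported type can be sketched in a general-purpose language. Python is used here purely for illustration; it is not one of the languages discussed in the paper, and the names `Stack`, `StackFacility` and `describe` are hypothetical. The sketch assumes that a facility is modeled as an object carrying the fixed parameters, and that values of the exported type are created and manipulated only through the facility's operations.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """The exported type: a value holds only its representation."""

    def __init__(self) -> None:
        self._items: List[T] = []


class StackFacility(Generic[T]):
    """A module instance (facility). Creating one fixes the generic
    parameters (a type and an operation) but creates no stack value."""

    def __init__(self, describe: Callable[[T], str]) -> None:
        self._describe = describe  # operation parameter of the generic module

    def new(self) -> Stack[T]:
        """Declare a variable of the exported type."""
        return Stack()

    def push(self, s: Stack[T], item: T) -> None:
        s._items.append(item)

    def pop(self, s: Stack[T]) -> T:
        return s._items.pop()

    def show(self, s: Stack[T]) -> str:
        return ", ".join(self._describe(x) for x in s._items)


if __name__ == "__main__":
    int_stacks = StackFacility[int](describe=str)  # instantiation: no data yet
    a, b = int_stacks.new(), int_stacks.new()      # two variables, one facility
    int_stacks.push(a, 1)
    int_stacks.push(b, 7)
    print(int_stacks.show(a), "|", int_stacks.show(b))
```

One facility manages any number of variables, which corresponds to the set of variables managed by a module instance described above.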

Figure 10: A (ADO) module

A template may also encapsulate an abstract data object (ADO),
the unit of reuse in languages such as C++, DEAL, Eiffel, and
Smalltalk. An ADO is a special case of an ADT, one that exports no
type and that encapsulates state (the value of an object), as
illustrated in Figures 9 and 10. Thus, ADOs can also be developed
in ADT-based languages such as Ada, Clu, Modula-2 and RESOLVE. An
ADO is defined using a template module (called a class in most
object-oriented languages) which exports a set of operations, and
defines the structure of objects created from the class.
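
In contrast to an ADT facility, instantiating an ADO template directly creates the encapsulated state. A minimal sketch follows, again in Python for illustration only; the class name `TrackCounter` and its operations are hypothetical.

```python
class TrackCounter:
    """ADO template (a class): exports operations only, no type for
    clients to declare variables of, and encapsulates its own state."""

    def __init__(self) -> None:
        self._count = 0  # state reachable only through the operations

    def increment(self) -> None:
        self._count += 1

    def value(self) -> int:
        return self._count


if __name__ == "__main__":
    air = TrackCounter()      # each instance is itself a data object
    surface = TrackCounter()
    air.increment()
    print(air.value(), surface.value())
```

Each instance carries its own value, so two instances of the same template never share state.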

In the Ada programming paradigm, a template
may encapsulate tasks, in addition to either exporting
a type or encapsulating state. (See Figures 11 and 12.)
Each task represents a unit of parallelism. When
encapsulated in a template, a task becomes active upon
the creation of an instance of the template. Tasks
proceed independently and may execute concurrently,
except when synchronizing (rendezvousing) with other
tasks at points called entries. Entries allow tasks to
exchange data synchronously, and to provide mutually
exclusive access to shared state information, allowing
clients to cause synchronous execution of a specific
portion of a task body. Exchanging and sharing of
information between two tasks can be done either
conditionally or unconditionally. Only one client may be
active within a task at any instant. Furthermore, a
template may export one or more task types, allowing
many copies of a task to be obtained from a single
task object. To obtain a task from such a template,
one declares a variable to be of the appropriate task
type and then invokes a task initiation operation on
the variable. When such an approach is used, multiple
tasks may be active concurrently within an instance.
Task templates can be used to create families
of tasks by specifying a discrete range to indicate
the indices of the tasks in the family.
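
The entry mechanism described above can be approximated outside Ada. The following is a minimal Python sketch, not Ada semantics: the class name `EntryTask` and the entry names `add`, `get` and `stop` are hypothetical. A thread plays the role of the task, it becomes active when the instance is created, and it serves one entry call at a time, so a caller blocks until its call has been served.

```python
import queue
import threading


class EntryTask:
    """A thread that serves entry calls one at a time, loosely like an Ada task."""

    def __init__(self):
        self._calls = queue.Queue()
        self._state = 0  # state encapsulated by the task
        self._thread = threading.Thread(target=self._body, daemon=True)
        self._thread.start()  # the task becomes active on creation

    def _body(self):
        while True:
            entry, arg, reply = self._calls.get()
            if entry == "stop":
                reply.put(None)
                return
            if entry == "add":
                self._state += arg
            reply.put(self._state)  # "get" simply returns the state

    def call(self, entry, arg=None):
        """Unconditional entry call: block until the task has served it."""
        reply = queue.Queue(maxsize=1)
        self._calls.put((entry, arg, reply))
        return reply.get()
```

Because only the task thread touches `_state`, clients obtain mutually exclusive access to it without further locking, which mirrors the rule that only one client is active within a task at any instant.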

In addition to the explicit parallelism available in
a task-based system, an abundance of parallelism is
available when the asynchronous remote procedure
call model of parallel execution is applied to programs
constructed from ADT and ADO modules.
Facilities (consisting of code and possibly state) are
statically assigned to processing elements (PEs); multiple
facilities may reside on the same PE. Facilities' operations
are invoked by sending call messages between PEs. To hide
the latency of a remote call, an operation is permitted
to continue execution until it attempts to access
a "locked" variable (this model of parallel execution
is termed asynchronous remote procedure call, or
ARPC [15]). A variable is automatically locked when
it is passed as a parameter to a call and is unlocked
upon return of the call. An operation attempting to
access a locked variable must wait for a remote call to
return before retrying the access. ARPC can achieve
parallel execution at multiple levels in the abstraction
hierarchy. Thus, potential parallelism within a
program increases with the number of levels of
abstraction, and the model encourages development of
highly cohesive, loosely coupled modules.
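
The locking rule of ARPC can be illustrated with futures. This is a hedged sketch under stated assumptions: a thread pool stands in for a remote PE, and the names `Var` and `arpc` are hypothetical and do not come from [15]. A variable passed to a call is treated as locked until the call returns, and the caller only waits when it actually touches that variable.

```python
from concurrent.futures import ThreadPoolExecutor


class Var:
    """A variable that is locked while it is a parameter of an outstanding call."""

    def __init__(self, value):
        self._value = value
        self._pending = None  # Future of the call currently holding the lock

    def get(self):
        # Accessing a locked variable waits for the remote call to return.
        if self._pending is not None:
            self._value = self._pending.result()
            self._pending = None
        return self._value


def arpc(pe, operation, var):
    """Send `operation(var)` to processing element `pe` without waiting."""
    current = var.get()  # a still-locked variable must be unlocked first
    var._pending = pe.submit(operation, current)
```

A caller might issue `arpc(pe, sorted, a)` and `arpc(pe, sorted, b)` back to back and continue with unrelated work; the two calls may overlap, and the caller blocks only at the first `a.get()` or `b.get()`.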

With the increase in potential concurrency comes
the added complexity of exploiting the concurrency.
Software components must be partitioned/clustered
according to some binding relationships (such as
communication, concurrency or shared data access),
and the clusters assigned to processors in a way that
causes efficient utilization of hardware resources and
simultaneously obeys system constraints [11, 12, 13, 9].

The explicit concurrency available in task-based
systems, and the implicit concurrency available via
ARPC, can be exploited on modern hardware platforms,
which are characterized by a large number of
interconnected processing elements (PEs). For example,
the Intel Paragon computer contains thousands
of computing nodes, running according to
the MIMD paradigm, and interconnected by a
2-dimensional mesh network. Its computing nodes contain
multiple CPUs that share memory. Although
there is shared memory within a node, there is no
globally shared memory. Additionally, one CPU per
node is dedicated to communication processing and
the others are general-purpose processors.

The modern execution model provides greater
flexibility than the older execution model. It also
provides the same services that the old model provided,
such as features to handle hardware interrupts, memory
management, application module scheduling, and fault
tolerance. Additionally, the system provides real-time
capabilities, which provide the application programs
with a guarantee of resource availability for critical
elements. Furthermore, the execution model is capable
of supporting a wide variety of computer architectures,
including uniprocessors, loosely and tightly coupled
distributed architectures, and a heterogeneous mix of
systems. Also, the model aids portability and
scalability. Two execution models that partially satisfy
this group of requirements are the Mach operating system
and the OSF/1 operating system. These operating
systems provide the desired portability and scalability,
as well as the capability to operate on many
different architectures. These systems emphasize a
layered approach to providing services to the operating
system. The operating systems are designed as
highly optimized microkernels to which functionality
can be added to create a complete operating system.

4  Transformation  of  Systems 


In  our  approach  to  reengineering,  we  increase  con¬ 
currency  while  maintaining  consistency  and  serializ- 
ability,  and  preventing  deadlock.  Figure  13  shows  a 
system  following  transformation. 

The software modules described in Section 2 and
depicted in Figure 2 can be described using generic
packages encapsulating ADTs, ADOs and task types.
There are at least two approaches that can be used to
model the module structures. In the first approach, a
module is represented by a single task whose task
entries correspond to the original module entries. This
approach respects the semantic constraint that only a
single entry may be active within a module at any
instant. However, due to subtleties in the Ada task
rendezvous construct, the periodic module entry cannot
be accurately modeled. Furthermore, this approach does
not take advantage of intramodule concurrency. In the
second approach, a module is represented by a package
containing one task for each module entry. Thus, the
amount of concurrency allowed is increased, but
consistency of module state is not guaranteed.

To maintain consistent values for module-level
state variables as well as global variables, semaphores
are used to ensure single access to each of them.
With the addition of the semaphores into the system,
the potential for deadlock is introduced. To
prevent deadlock, module managers are used to control
accesses to module entries. For example, if two
entries a and b of the same module could enter a
circular wait state, the module manager may not allow
a and b to execute concurrently. It may not always
be necessary to be this strict with intramodule
concurrency, and the module manager could make
optimizations where appropriate to increase concurrency.
For example, segments of entry a could perhaps
be executed concurrently with segments of entry b.
To prevent deadlock when accessing global
variables, a system manager is used to control
concurrency among modules, ESRs and CSRs to avoid
circular waiting conditions.
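
The admission policy of a module manager can be sketched as follows. This is an illustrative Python sketch only: the class `ModuleManager` and its methods are hypothetical names, and it assumes that the pairs of entries able to enter a circular wait have already been identified by analysis. The manager refuses to run an entry while a conflicting entry of the same module is active.

```python
import threading


class ModuleManager:
    """Admits module entries unless a conflicting entry is already active."""

    def __init__(self, conflicts):
        # conflicts: pairs of entry names that could enter a circular wait
        self._conflicts = {frozenset(pair) for pair in conflicts}
        self._active = set()
        self._cond = threading.Condition()

    def _blocked(self, entry):
        return any(frozenset((entry, other)) in self._conflicts
                   for other in self._active)

    def try_enter(self, entry):
        """Non-blocking admission test; returns True if the entry may run."""
        with self._cond:
            if entry in self._active or self._blocked(entry):
                return False
            self._active.add(entry)
            return True

    def enter(self, entry):
        """Blocking admission: wait until no conflicting entry is active."""
        with self._cond:
            self._cond.wait_for(
                lambda: entry not in self._active and not self._blocked(entry))
            self._active.add(entry)

    def leave(self, entry):
        """Mark the entry as finished and wake any waiting entries."""
        with self._cond:
            self._active.discard(entry)
            self._cond.notify_all()
```

With `ModuleManager([("a", "b")])`, entry b is held back while a is active, but an unrelated entry c is admitted at once. A finer-grained manager could apply the same test to segments of entries rather than whole entries.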

It  is  the  task  of  reengineering  to: 

•  insert  calls  to  semaphore  operations  before  and 
after  shared  variable  access; 

•  partition  global  and  module  variables  into  seg¬ 
ments  that  can  be  accessed  concurrently; 

• generate module managers that maximize
intramodule concurrency, subject to constraints
such as liveness and serializability;

• generate system managers that maximize intermodule
concurrency, subject to constraints such
as liveness and serializability;

•  partition  module  entries  into  segments  that  can 
execute  concurrently;  and 

•  replicate  modules  and  procedures  to  maximize 
concurrency. 
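
The first task in the list above, inserting semaphore calls around shared variable accesses, can be sketched on a toy program representation. Everything here is an assumption made for illustration: statements are given as pairs of source text and the variables they use, `P` and `V` denote the usual semaphore operations, and the semaphore names `sem_<variable>` are hypothetical. Acquiring semaphores in a fixed (alphabetical) order is a standard way to avoid circular waits; it stands in for, and is not, the manager-based scheme described in the text.

```python
def guard_shared_accesses(statements, shared):
    """Wrap each statement that touches shared variables in P/V calls.

    statements: list of (text, variables_used) pairs (a toy representation).
    shared: set of shared variable names.
    """
    out = []
    for text, used in statements:
        locks = sorted(set(used) & shared)  # fixed order avoids circular waits
        out.extend(f"P(sem_{v})" for v in locks)
        out.append(text)
        out.extend(f"V(sem_{v})" for v in reversed(locks))
    return out
```

Statements that use no shared variable pass through unchanged, so purely local computation is not serialized.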

Obviously, software reengineering of the nature
just described requires more than a translation of
source code written in one language into source code
written in another language. Instead, reengineering
is a transformation of a software and hardware
system from one paradigm into another. To perform
such a transformation, it is not enough to have
a syntactic knowledge of the system being reengineered;
a semantic knowledge must also be obtained.
Transformation of a system implementation from one
paradigm to another involves the initial task of
reverse engineering, that is, capturing the intermediate
representation, and collecting the metrics necessary
for assessment and optimization. Our
language-independent representation allows the redesign of a
system with the goals of employing modern software
engineering principles, exploiting parallelism safely,
and conforming to constraints related to timing,
dependability and security. In this Section we discuss
the metrics that must be collected for concurrency
analysis and show how the metrics are used to exploit
parallelism.

4.1  Characterizing  Complex  Systems 

When reasoning about parallelism, it is necessary
first to analyze parallelism at a low level of granularity,
i.e., at the statement or the instruction level. To
obtain parallelism information for units of larger
granularity, one successively synthesizes parallelism
information from lower levels. Our representation allows
parallelism to be exploited at the following levels:


1.  among  statements  within  an  operation; 

2.  among  operations  within  an  instance; 

3.  among  operations  of  different  instances; 

4.  within  a  task  entry; 

5.  among  entries  within  a  task; 

6. among tasks and operations within a package
instance;

7.  among  package  instances; 

8. among clusters of package instances; and

9.  among  systems. 

We have defined language-independent intermediate
forms (or metrics) for capturing the features
of complex systems that are essential for reasoning
about parallelism, encapsulation, information hiding,
reuse, real-time, fault tolerance, and security.
The intermediate representations presented in this
Section can be derived by a combination of compile-time
analysis tools and run-time monitoring tools.

Dependence graphs represent program statements
as nodes and use directed edges to denote statement
ordering implied by the dependences in a source
program. Different kinds of ordering requirements
are represented in different dependence graphs. The
classical data dependence graph (DDG) and control
dependence graph (CDG) are used, in addition to the
facility dependence graph (FDG) which was invented
for the purpose of clone analysis [14].
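
As an illustration of the classical DDG, the sketch below derives dependence edges for a straight-line sequence of statements from their read and write sets. It is a simplified, conservative Python sketch: the function name and input format are assumptions, control flow is ignored, and no attempt is made to discard dependences that are hidden by an intervening write.

```python
def data_dependence_graph(statements):
    """Return edges (i, j, kind) with i < j for a straight-line statement list.

    statements: list of (writes, reads) pairs of variable-name sets.
    """
    edges = []
    for j, (w_j, r_j) in enumerate(statements):
        for i in range(j):
            w_i, r_i = statements[i]
            if w_i & r_j:
                edges.append((i, j, "flow"))    # write then read
            if r_i & w_j:
                edges.append((i, j, "anti"))    # read then write
            if w_i & w_j:
                edges.append((i, j, "output"))  # write then write
    return edges
```

Two statements with no path between them in the resulting graph are candidates for parallel execution, subject to the control and facility dependences.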

A  directed  acyclic  graph  (a  DAG)  is  used  to  show 
the  call  relationships  among  the  facilities  of  a  pro¬ 
gram  and  to  represent  inter-module  parallelism.  A 
program  is  modeled  by  a  DAG,  G  =  (V,  E),  where: 

1.  a  vertex  v  in  V  denotes  the  operations  of  a  fa¬ 
cility,  f(v); 

2.  an  edge  (x,  y)  in  E  indicates  that  the  code  of 
facility  f(x)  calls  some  operation(s)  provided  by 
facility  f(y). 
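
The call-graph model above can be checked mechanically. The following Python sketch assumes the call relationships are supplied as a dictionary from each facility to the set of facilities it calls; the function name `check_call_dag` and the facility names in the usage note are hypothetical. It uses the standard-library `graphlib` module (Python 3.9 or later) to confirm that the graph is acyclic and to list facilities with callees before callers.

```python
from graphlib import CycleError, TopologicalSorter


def check_call_dag(calls):
    """Return facilities ordered callee-first, or raise if calls form a cycle.

    calls: dict mapping a facility name to the set of facilities it calls,
    i.e. an edge (x, y) for every y in calls[x].
    """
    try:
        # TopologicalSorter treats the mapped values as predecessors, so
        # every called facility appears before its callers.
        return list(TopologicalSorter(calls).static_order())
    except CycleError as err:
        raise ValueError(f"call graph is not a DAG: {err.args[1]}") from err
```

For example, `check_call_dag({"Main": {"Stack", "Queue"}, "Stack": {"List"}, "Queue": {"List"}})` places `List` before `Stack` and `Queue`, and both of those before `Main`.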

When assessing the timing properties of
distributed/parallel, periodic, time-constrained processes,
it is important to capture several metrics [9].
Obviously, absolute timing constraints such as periods
and deadlines must be determined. Additionally,
relative timing constraints can be identified. For
scheduling and assignment/allocation optimization,
it is useful to represent the system as a set of
(autonomous) activities, and their constituents, which
we call beads.

The module call DAG is used to represent the
fault tolerance and security properties of systems.
The degree of redundancy is given for nodes and
edges of the graph, indicating the number of redundant
software modules and intermodule access paths,
respectively. Additionally, the reliability attributes
of modules and interconnections are specified as real
numbers between zero and one. Security is represented
as classification levels, which may be associated
with DAG edges and nodes. When associated
with a node, a security level indicates the classification
level required to execute the code represented
by the node. Classification levels on edges indicate
the minimum degree of security that must be associated
with any transmission of the call parameters
represented by the edge.

4.2  Concurrency  Analysis 

Given  the  statement  dependence  graphs  and  the 
module  call  DAG,  concurrency  information  is  ob¬ 
tained  first  at  the  statement  level.  That  information 
is  then  aggregated  to  the  procedure  level,  then  to  the 
module  level,  and  so  on,  until  concurrency  informa¬ 
tion  is  obtained  for  the  desired  level  of  granularity. 

Extraction  of  ARPC  parallelism  is  achieved  by 
first  augmenting  the  call  DAG  to  indicate  two  kinds 
of  parallelism.  An  edge  drawn  using  parallel  lines  in¬ 
dicates  parallelism  between  a  client  and  an  exporter. 
The  DAG  is  also  used  to  indicate  which  co-exporters 
of  a  client  can  execute  in  parallel  with  each  other; 
this type of parallelism is denoted by labeling edges
with  sets  of  facilities.  To  enable  one  to  determine 
whether  two  arbitrary  facilities  can  execute  in  par¬ 
allel,  we  have  defined  theorems  that  state  how  the 
parallelism  information  contained  in  the  DAG  can 
be  propagated  among  nodes  of  the  graph  (for  details 
see  [11,  13]). 

Frequently,  an  ADT  module  is  needed  simultane¬ 
ously  by  multiple  clients,  thus  causing  contention. 
To  decrease  queueing  delays  to  execute  module  op¬ 
erations,  cloning  of  module  instances  (facilities)  is 
used  in  conjunction  with  the  ARPC  paradigm  [14]. 
The  detection  of  statements  that  contend  for  a  facil¬ 
ity  is  accomplished  by  considering  the  DDG,  CDG 
and  FDG  in  conjunction.  An  edge  in  an  FDG  shows 
where  it  is  beneficial  to  clone  a  facility,  assuming  that 
data  and  control  dependences  do  not  prohibit  paral¬ 
lelism  between  the  statements  involved  in  the  facility 
dependence. 

To detect contention for a facility, the statements
of an operation are partitioned into units. A unit is
a sequence of statements that must execute in order,
due to the data dependences among them. (Hence,
the statements of a unit cannot contend for a facility,
but different units may contend for the same facility.)
For any statement $S_i$ in a unit, except the last
statement of the unit, $S_i$ must complete execution before
$S_{i+1}$ can begin execution. Thus, each unit can utilize
only one clone of each of the facilities that it uses.
However, if two units can execute in parallel, they
may contend for clones. Two units can execute in
parallel with each other as long as neither is an
ancestor of the other (i.e., there is not a directed path
from one to the other). This notion can be used to
construct a matrix, $P$, showing which units can run
in parallel. The information in the matrix is used to
group units with others, such that each member of a
group can run in parallel with all others. Given the
groups, the parallelism matrix is used to determine
the maximum number of clones that can be used
simultaneously among all groups.
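
The construction of the parallelism matrix and the clone count can be sketched as follows. This Python sketch makes several assumptions that are not in the source: units are numbered from 0, the ordering between units is given as a list of directed edges, and groups are found by brute force, which is only practical for small examples. The grouping algorithm actually used for clone analysis is described in [14] and may differ.

```python
from itertools import combinations


def parallelism_matrix(n, edges):
    """P[i][j] is True when units i and j have no directed path between them."""
    reach = [[False] * n for _ in range(n)]
    for i, j in edges:
        reach[i][j] = True
    for k in range(n):  # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return [[i != j and not reach[i][j] and not reach[j][i]
             for j in range(n)] for i in range(n)]


def max_clones(P, users):
    """Largest set of mutually parallel units among those using one facility.

    users: indices of the units that use the facility. Brute force, so only
    suitable for small examples.
    """
    for size in range(len(users), 0, -1):
        for group in combinations(users, size):
            if all(P[a][b] for a, b in combinations(group, 2)):
                return size
    return 0
```

For a diamond-shaped ordering (unit 0 before units 1 and 2, which both precede unit 3), only units 1 and 2 can run in parallel, so a facility used by all four units can benefit from at most two clones.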

After  the  software  components  of  a  complex  sys¬ 
tem  have  been  reengineered,  it  is  necessary  to  assign 
them  to  the  nodes  of  a  parallel  and/or  distributed 
computer  [11, 13,  9].  Due  to  the  large  number  of  soft¬ 
ware  and  hardware  components  in  complex  systems, 
the  cost  of  obtaining  an  optimal  assignment  of  soft¬ 
ware  components  to  processors  is  quite  high.  Thus, 
heuristics  are  used  for  assignment.  Furthermore,  it 
is  useful  to  cluster  components  before  assignment  in 
order  to  reduce  the  problem  size. 

We  have  developed  a  technique  to  distribute  the 
facilities  of  a  program  over  the  processing  elements 
(PEs)  of  distributed  memory  parallel  computers. 
The  technique  uses  a  random  neural  network  (RNN) 
to  assign  facilities  to  PEs  with  the  objectives  of  en¬ 
abling  maximum  parallelism  among  facilities,  and 
achieving  this  level  of  parallelism  with  the  minimum 
number  of  PEs  possible.  We  also  incorporate  mini¬ 
mization  of  communication  costs  into  the  objective 
function  by  using  a  prepass  to  the  neural  network. 
The  prepass  forms  clusters  of  heavily-communicating 
software  units.  Following  the  formation  of  clusters, 
the random neural network assigns clusters to PEs
using the objectives of maximum parallelism and PE
conservation.
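
The text does not specify how the prepass forms clusters, so the following Python sketch is only a stand-in: it merges any two software units whose pairwise communication cost reaches a chosen threshold, using a union-find structure. The function name, the `traffic` input format and the threshold are all assumptions made for illustration, and the random neural network assignment step is not shown.

```python
def cluster_by_communication(units, traffic, threshold):
    """Merge units whose pairwise traffic is at least `threshold`.

    traffic: dict mapping (unit_a, unit_b) to a communication cost.
    Returns a list of clusters (sorted lists of unit names).
    """
    parent = {u: u for u in units}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for (a, b), cost in traffic.items():
        if cost >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for u in units:
        clusters.setdefault(find(u), []).append(u)
    return sorted(sorted(c) for c in clusters.values())
```

Clustering first shrinks the assignment problem: the assignment step then places a few clusters on PEs instead of many individual units.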

To  provide  the  required  dependability  and  secu¬ 
rity  properties,  one  must  insure  that  the  specifica¬ 
tion  is  heeded.  To  achieve  redundancy,  redundant 
copies  of  a  component  are  assigned  to  different  phys¬ 
ical  processors.  Similarly,  redundancy  in  software 
component  connections  is  achieved  by  a  careful  as¬ 
signment  of  components  to  processors.  Reliability  is 
achieved  by  selecting,  for  each  software  component, 
a  hardware  component  with  a  sufficiently  high  de¬ 
gree  of  reliability.  Note  that  the  communication  links 
must  also  be  considered  for  reliability  of  messages 
between  software  components.  Security  of  message 
flows  can  be  achieved  with  either  encryption  or  with 
dedicated  networks  for  each  security  class.  For  code 
modules,  security  is  attained  by  either  run-time  sys¬ 
tem  mechanisms  or  by  dedicated  code  processors  for 
each  security  level. 

5  Conclusions 

We  have  described  a  framework  for  automated 
reengineering  of  complex  systems.  Our  intermediate 
representation  for  complex  systems  is  not  linked  with 
any  specific  combination  of  programming,  hardware 
or  execution  paradigms.  Furthermore,  it  allows  the 
capture  of  essential  system  metrics  relating  to  par¬ 
allelism,  timing,  fault  tolerance,  and  security.  The 
representation  enables  the  incorporation  of  object- 
based  software  engineering  techniques  during  reengi¬ 
neering.  Additionally,  we  describe  how  the  repre¬ 
sentation  is  used  to  enhance  parallelism  during  the 
reengineering  process,  to  assess  timing  properties,  to 
achieve  dependability  and  security,  and  to  optimize 
software-to-hardware assignments.

We continue to evolve our reengineering analysis
tools, which drive the evolution of the metrics. Thus,
additional metrics for parallelism analysis continue
to be identified. We are also beginning to develop
reverse engineering tools to capture the metrics for
languages such as assembler, CMS-2, and Ada.
Furthermore, we plan to develop reengineering tools that
transform previous generation complex systems into
layered, highly concurrent designs and to feed these
designs to backend code generators for languages
such as Ada, Smalltalk and Eiffel. Finally,
we plan to use parallel processors such as the Intel
Paragon to implement the synchronization and
deadlock avoidance techniques incorporated in the
module and system managers.

Abstract

We are developing an authoring tool called I-Doc
that will automate the process of generating
documentation and user help for software systems. The
focus  of  the  tool  is  on  capture  of  the  requirements 
and  design  decisions  that  form  the  content  of  soft¬ 
ware  documentation.  This  information  can  then  be 
used  to  generate  summaries  and  explanations  of  the 
software  on  demand.  The  objective  of  this  research 
is  to  provide  on-line  assistance  for  software  maintain- 
ers  and  other  software  professionals  that  can  take  the 
place  of  conventional  bulk  documents.  I-Doc  is  de¬ 
signed  to  support  the  reengineering  of  software  sys¬ 
tems,  since  some  of  the  necessary  design  information 
will  have  to  be  captured  by  annotating  existing  code. 
Reengineering  technology,  specifically  transformation 
technology,  is  employed  during  the  generation  process 
to  simplify  and  reorganize  design  information  when 
describing  software. 

1  Introduction 

Conventional  documentation  for  software  systems 
has  surprisingly  little  value,  given  the  amount  of  time 
and  effort  spent  to  create  it.  This  is  particularly  true 
for  large,  mature  systems.  Such  systems  typically  have 
voluminous  design  documents,  in  which  it  is  difficult  to 
find  information  relevant  to  any  specific  maintenance 
task.  If  the  documentation  is  not  maintained  in  lock 
step  with  the  code,  it  quickly  becomes  inaccurate,  so 
maintainers  cannot  rely  upon  it.  There  is  an  increas¬ 
ing  need  for  alternative  technologies  that  can  provide 
maintainers  and  users  of  software  systems  with  the 
information  that  they  need  to  operate  and  maintain 
those  systems. 

We are developing a documentation authoring tool
that will automate the process of generating
documentation and user help for software systems. This tool
will  result  in  dramatic  improvements  in  the  way  doc¬ 
umentation  is  developed,  maintained,  and  used.  If 
the  design,  requirements,  and  assumptions  underly¬ 
ing  code  are  made  explicit,  generation  of  documenta¬ 
tion  can  be  substantially  automated.  This  underlying 
knowledge  can  be  acquired  and  formalized  in  a  natu¬ 
ral,  incremental  fashion  that  does  not  overly  burden 
developers. 

Documentation  will  be  generated  dynamically,  in 
response  to  specific  requests  for  user  information. 
When  the  user  requests  information,  the  documen¬ 
tation  system  determines  what  information  content 
should  be  presented.  It  composes  a  response  by  com¬ 
bining  textual  descriptions  previously  entered  in  a 
database,  and  automatically  generating  natural  lan¬ 
guage  output  to  fill  in  the  rest.  The  content  of  the 
generated  output  depends  upon  the  level  of  expertise 
of  the  user,  and  the  history  of  previous  user  docu¬ 
mentation  requests.  This  fundamentally  changes  the 
nature  and  role  of  documentation.  There  will  be  less 
need  for  users  to  search  through  bulk  documents  in 
order  to  obtain  answers  to  specific  questions.  Instead, 
the  documentation  system  will  search  its  own  knowl¬ 
edge  base  for  the  information  that  the  user  requires, 
and  compose  explanations  meeting  the  user’s  needs. 

Other  CASE  (computer-aided  software  engineer¬ 
ing)  tools  have  been  developed  to  support  the  author¬ 
ing  of  documentation.  These  tools  differ  in  that  they 
tend  to  be  oriented  toward  the  generation  of  specific 
reports,  such  as  those  mandated  by  Department  of 
Defense  procurement  standards.  They  make  it  eas¬ 
ier  to  produce  such  reports,  but  that  does  not  make 
the  reports  themselves  significantly  more  useful.  The 
I-Doc  approach  is  designed  to  make  such  reports  un¬ 
necessary  for  most  purposes,  although  it  will  still  be 
possible  to  generate  bulk  reports  from  I-Doc’s  design 
repository. 

The approach employs reengineering technology,
and is designed to be compatible with reengineering
efforts. Information required to support documentation
is entered in a reengineering knowledge base, in the
form of annotations on source code parse trees.
Transformations are employed to generate simplified and
reorganized sections of code that highlight the aspects
of the system being documented. Design recovery
activities have the effect of bringing documentation up
to date, facilitating subsequent software maintenance.

2 The State of Current Documentation

Let  us  examine  the  problems  associated  with  con¬ 
ventional  software  documentation,  to  see  how  new 
techniques  can  alleviate  those  problems. 

The  primary  emphasis  of  conventional  system  doc¬ 
umentation  is  on  amassing  information.  Documenta¬ 
tion  standards,  especially  government  standards  such 
as  MIL-STD-2167A  or  SDD,  require  developers  sys¬ 
tematically  to  describe  all  details  of  a  design  such  as 
the  inputs  and  outputs  of  each  function.  The  structure 
of  such  documents  is  fixed  and  standardized. 

The first problem with such system documentation
is that it is not sufficiently activity-oriented, i.e., it is
not designed to support the activities of the intended
readership. User manuals are activity-oriented in this
sense; such manuals are designed to help people whose
activity is to use the software system. System
documentation is not, or if it is, the set of activities being
supported is incomplete. System documentation can
potentially support a number of activities, including
the following:

•  review  by  the  customer  to  check  that  all  stated 
requirements  are  met  in  the  design, 

•  design  reviews  in  which  the  quality  and  validity 
of  the  design  is  evaluated,  and 

•  maintenance  activities,  in  which  maintainers  seek 
to  obtain  information  about  the  design  so  that 
modifications  and  enhancements  can  be  per¬ 
formed  correctly. 

These activities are very different, yet documents
often must support more than one of them. The
activity that is least supported, of course, is maintenance.
Maintenance manuals are common for physical
devices, but are rare for software systems. Of course
it is harder to write maintenance manuals for software
than it is for devices, because maintenance tasks
change as the software evolves. Nevertheless,
maintenance activities in general are vastly different from
specification and design reviews. Maintainers rarely
perform methodical reviews of entire systems; rather,
they inspect specific modules in detail in order to
determine how they can be modified. Interrelationships
between modules can be extremely important.
Documents such as design documents, which describe all
components in a uniform way, are more suited to
design activities than maintenance activities.

In addition to being activity-oriented, good
documentation is task-oriented, i.e., designed to help
readers perform specific tasks. Tutorial user manuals are
frequently written in a task-oriented fashion. For
example, a word processor manual might have the user
work through sample tasks such as composing a
business letter or printing mailing labels.

A task-oriented approach centered on hypothetical
tasks is not necessarily the best way to design
documentation in general. It requires the reader to take
the  time  to  work  through  exercises,  whereas  document 
users  typically  are  impatient  and  skip  through  the  doc¬ 
umentation  trying  to  find  out  what  they  need  so  they 
can  get  on  with  their  actual  job.  This  is  the  motivation 
for  the  new  “minimalist”  approach  to  documentation, 
which  uses  overviews,  structured  exercises,  and  any 
information  that  the  user  cannot  discover  through  ex¬ 
perimentation  with  the  system  [3].  However,  the  min¬ 
imal  approach  is  not  a  rejection  of  task  orientation  per 
se,  just  of  manuals  that  are  oriented  around  lengthy 
hypothetical  exercises  and  that  contain  information 
one  can  figure  out  on  one’s  own. 

It  is  certainly  true  that  system  documentation  can 
be  improved  simply  by  learning  lessons  from  other 
types  of  documentation  such  as  user  documentation. 
However,  even  well-written  paper  documents  suffer 
from  basic  limitations.  A  person  writing  a  document 
can  only  make  rough  guesses  about  what  tasks  the 
reader  might  be  performing,  what  information  he  or 
she  might  want  to  know,  and  the  level  of  expertise  of 
the reader. Detailed exercises can help eliminate the
guesswork: if the reader works through an exercise,
the writer can try to anticipate what kinds of
questions the reader might want to ask at each point in
the  exercise.  This  does  not  work  if  readers  lack  the 
patience  to  work  through  the  exercises,  as  the  mini¬ 
malists  argue. 

The key to a substantial improvement in documentation
is an on-line system that can construct presentations
dynamically, reducing the reliance on guesswork.
Interaction with the system should be in the
form of a question-answer dialog; that way, the reader
indicates to the system what he or she wants to know.
The  system  can  present  information  in  the  context 
of  the  reader’s  activities,  simply  by  asking  the  reader 
questions  about  those  activities.  If  the  reader  does 
not  understand  the  descriptions  generated  by  the  sys¬ 
tem,  the  reader  should  be  able  to  request  a  clarifica¬ 
tion,  and  have  the  system  adjust  its  estimate  of  the 
reader’s  level  of  expertise.  The  process  of  supporting 
documentation  then  becomes  less  an  activity  of  writ¬ 
ing  text  and  more  an  activity  of  providing  the  system 
with  the  information  that  it  needs  to  produce  a  range 
of  descriptions  of  the  system. 

Such  a  capability  constitutes  a  clear  advance  of  the 
state  of  the  art  in  documentation  support.  However, 
the  technologies  needed  to  realize  such  a  capability, 
such  as  design  repositories,  hypertext,  program  analy¬ 
sis  and  transformation,  and  natural  language  genera¬ 
tion,  are  well  developed  and  in  a  state  where  they  can 
be  brought  to  bear  effectively  on  the  documentation 
problem. 

3  An  Example  from  the  Reader’s 
Standpoint 

The  following  example  illustrates  how  I-Doc  is  in¬ 
tended  to  function,  from  the  viewpoint  of  the  reader, 
i.e.,  the  person  asking  questions  about  the  system. 

The  system  in  question  is  real-time  embedded  con¬ 
trol  software  of  a  fighter  aircraft  radar  system.  This 
example  was  studied  by  Hughes  in  a  research  effort 
sponsored  by  Wright  Patterson  Air  Force  Base  [4]. 
Hughes  built  a  demonstration  hypertext  documenta¬ 
tion  system  to  support  a  hypothetical  maintenance 
task  on  this  system.  We  have  been  using  the  same  ex¬ 
ample  as  an  initial  test  case,  to  design  I-Doc  so  that  it 
can generate descriptions automatically that are similar
to what the Hughes group constructed manually
in  their  demonstration. 

One  of  the  functions  of  the  radar  system  software 
is  a  Range  While  Search  function,  which  electronically 
controls  how  the  aircraft’s  radar  scans  the  airspace. 
Normal  Range  While  Search  scans  a  volume  of  air 
space  that  is  wider  in  azimuth  than  it  is  in  elevation, 
say  60  degrees  to  the  left  and  right  of  the  aircraft,  and 
10  degrees  above  and  below  the  horizon.  The  volume 
is  scanned  by  sweeping  back  and  forth  horizontally, 
top  to  bottom.  The  hypothetical  maintenance  task  is 
to  change  the  code  so  that  it  can  scan  volumes  that 
are  wider  in  elevation  than  in  azimuth,  by  scanning 
vertically  rather  than  horizontally. 

Figure  1  shows  the  window  that  the  user  interacts 
with in order to initiate the query. The user types in
the  name  of  the  component  that  he  or  she  is  interested 
in.  Mechanisms  for  selecting  components  from  a  menu 
of  alternatives  will  also  be  provided. 

Before I-Doc can accept a query, however, it first
requests information about the user and the task being
performed. The user parameters are entered via a
menu such as that shown in Figure 2. The user is
asked to indicate his or her role on the project, and
in this example selects Maintainer. I-Doc will therefore
include in the system descriptions that it generates
information relevant to maintenance, e.g., inputs,
outputs, and functional decomposition of each module.
If the user had chosen a different role, such
as User, descriptions would be more functional in nature,
and limited to those aspects of functionality that
would be visible to the user (in this case the pilot or
radar intercept officer responsible for controlling and
monitoring the radar).

Additionally, I-Doc requires an initial estimate of
the  user’s  degree  of  familiarity  with  the  system.  In 
this  case  the  user  selects  Low,  which  causes  I-Doc  to 
limit  the  extent  to  which  it  refers  to  implementation 
details  such  as  data  representations. 

Next,  I-Doc  requests  a  characterization  of  the  task 
the  user  is  performing.  Four  types  of  activities  are 
known  relating  to  system  maintenance:  adding  func¬ 
tionality,  fixing  bugs,  optimizing,  and  validating  doc¬ 
umentation.  Add  Functionality  is  the  choice  in  this 
case. In this context, it causes I-Doc to generate high-level
overviews of the functionality in question. If Fix
Bug or Optimize were chosen, the description would
focus more narrowly on those system components involved
in generating the behavior that must be fixed or optimized.
Validate Documentation is chosen when the
user  (typically  a  developer)  wishes  to  see  a  variety  of 
descriptions  generated  by  I-Doc,  to  verify  that  the  sys¬ 
tem  can  generate  valid  documentation  in  each  case. 

Figure 2: User Parameter Window
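
The effect of the three user parameters described above can be sketched in a few lines of Python. This is only an illustration of the idea, not I-Doc's implementation: the class and function names, the `HIGH` familiarity level, and the reduction of "limit implementation detail" to a single boolean are all assumptions made for the example. Only the role names, the `Low` setting, and the four task types come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    MAINTAINER = "Maintainer"
    USER = "User"


class Familiarity(Enum):
    LOW = "Low"
    HIGH = "High"  # hypothetical level; the text only mentions Low


class Task(Enum):
    ADD_FUNCTIONALITY = "Add Functionality"
    FIX_BUG = "Fix Bug"
    OPTIMIZE = "Optimize"
    VALIDATE_DOCUMENTATION = "Validate Documentation"


@dataclass(frozen=True)
class ReaderProfile:
    role: Role
    familiarity: Familiarity
    task: Task


def description_plan(profile: ReaderProfile) -> dict:
    """Map a reader profile to coarse choices about what to present."""
    plan = {}
    # Role decides which kinds of information are relevant at all.
    if profile.role is Role.MAINTAINER:
        plan["topics"] = ["inputs", "outputs", "functional decomposition"]
    else:
        plan["topics"] = ["user-visible functionality"]
    # Low familiarity suppresses implementation detail (simplified to a flag).
    plan["show_data_representations"] = profile.familiarity is not Familiarity.LOW
    # Task decides how broad or narrow the description should be.
    if profile.task is Task.ADD_FUNCTIONALITY:
        plan["scope"] = "high-level overview"
    elif profile.task in (Task.FIX_BUG, Task.OPTIMIZE):
        plan["scope"] = "components producing the behavior in question"
    else:
        plan["scope"] = "range of sample descriptions"
    return plan
```

For the running example, a `ReaderProfile` of Maintainer, Low, and Add Functionality yields a high-level overview that lists inputs, outputs, and decomposition but omits data representations.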


Figure  3  shows  a  sample  output,  based  upon  the 
parameters  selected  above.  I-Doc  cannot  yet  generate 
this  output,  as  the  project  is  just  getting  started;  this 
is  merely  an  illustration  of  the  type  of  output  that  will 
be  generated.  The  figure  is  a  display  generated  by  the 
Mosaic  hypertext  system  [2],  which  is  the  hypertext 
system  used  as  an  output  interface  by  I-Doc.  The  fig¬ 
ure  contains  a  simplified  decomposition  diagram  show¬ 
ing  the  major  components  of  Range-While-Scan:  Scan 
Generation  and  Output  Processing.  It  summarizes  the 
function  of  each  component,  and  the  main  inputs  and 
outputs  of  each. 

Several  points  are  illustrated  by  this  output  exam¬ 
ple.  First,  the  output  is  selective  both  in  terms  of 
what components of Range-While-Scan are described,
and  what  properties  of  those  components  are  men¬ 
tioned.  In  this  overview  the  major  components  of 
Range-While-Scan  are  shown,  but  not  those  compo¬ 
nents  responsible  for  checking  and  reporting  errors. 
(For  example,  if  Output  Processing  detects  erroneous 
radar  input,  an  error  is  signaled.)  It  characterizes 
the function of the components (e.g., Scan-Generation
creates a scan pattern), and the inputs and outputs
of each component. If the task or user parameters
were different, the summary would change accordingly,
perhaps including more detailed information
about the system.

Because the presentation medium is hypertext, it is
not necessary to enumerate all relevant properties of
Range-While-Scan. It is sufficient to provide hypertext
links which, if selected, will permit the reader to
obtain further information. Some of these buttonable
items are interspersed through the text, and appear
underlined in the figure. Other items appear at the
bottom. Because Range-While-Scan’s performance requirements
are particularly important for anyone attempting
to add functionality to it, a special hypertext
link is included to access this information. Other relevant
topics are included at the bottom. The links
named “electronically controlled radars” and “radar
data processing” provide background about the application
domain that might be useful to a maintainer
who is unfamiliar with the application. Below
are listed links for obtaining more information about
Range-While-Scan’s components. Further down, below
the bottom of the scrolling window in this example,
are pointers that allow the reader to see the source
code from which this description is derived, either the
full text or a simplified version corresponding to what
appears in this hypertext description.

4 System Architecture

I-Doc  contains  the  following  major  functions. 

•  An  acquisition  interface  is  used  to  input  the  anno¬ 
tations  necessary  to  generate  system  descriptions. 

•  This  information  is  stored  in  a  repository,  and 
in  annotations  embedded  within  the  source  code 
itself. 

•  A  query  interface,  as  shown  in  Figure  1,  is  used 
to  input  queries  from  the  user. 

•  The  source  code  and  repository  are  processed  to 
extract  the  information  to  be  presented. 

•  A  presentation  layout  for  the  information  is  con¬ 
structed. 

•  The  presentation  layout  is  displayed  as  hyper¬ 
text.  Requests  to  traverse  hypertext  links  are  in¬ 
tercepted  and  passed  back  to  the  extraction  and 
layout  subsystems  to  generate  new  presentations. 

These  components  will  be  described  in  further  detail 
below,  but  first  the  information  content  that  these 
components  operate  on  will  be  discussed. 

5  Underlying  Knowledge 

In  order  to  generate  appropriate  software  descrip¬ 
tions,  I-Doc  requires  a  variety  of  information  about  the 
software  and  its  design.  Some  of  this  information  can 
be  extracted  directly  from  the  code  and  from  CASE 
repositories.  Other  information  must  be  added  to  the 
design  in  the  form  of  annotations. 

First,  a  hierarchical  decomposition  of  the  design 
into functional components is required, together with
the  types  of  the  components’  inputs  and  outputs.  The 
pattern  of  data  flow  among  components  is  necessary 
as  well. 

In  order  to  present  the  information  flow  between 
modules  in  natural  language,  some  additional  char¬ 
acterizations  of  the  data  and  the  operations  on  the 
data  are  required.  First,  it  is  useful  to  classify  the 
type  of  operation  being  performed  by  the  component. 
Classifications  that  have  been  identified  so  far  as  use¬ 
ful  include  create,  destroy,  filter,  insert,  remove,  re¬ 
trieve,  process,  and  validate.  For  example,  the  mod¬ 
ule  Scan-Generation  is  classified  as  creating  scan  pat¬ 
terns.  Modules  whose  function  is  to  validate  data  were 
omitted from the summary in Figure 3, under the assumption
that initial descriptions should assume that
all data is valid, and methods for handling exceptional
data  will  be  described  later.  The  content  and  use  of 
data  structures  is  characterized  as  well.  Data  struc¬ 
tures  are  categorized  as  to  whether  they  represent  ob¬ 
jects,  aggregates  of  objects  (sets,  sequences,  etc.),  or 
names  of  objects.  This  enables  I-Doc  to  refer  to  the 
output  of  Output-Processing  as  a  set  of  contacts,  re¬ 
gardless  of  the  actual  data  representation  (e.g.,  an  ar¬ 
ray  of  pointers  to  contact  objects). 
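
As a rough illustration of how these classifications could drive wording, the Python sketch below attaches an operation class to a component and a category to its data, then produces a one-line summary. The type names, the verb chosen for each operation, and the `Input-Check` component used in the usage note are assumptions for the example; the operation classes and the three data categories are those listed above. Article selection ("a" versus "an") is deliberately ignored.

```python
from dataclasses import dataclass
from enum import Enum


class Operation(Enum):
    """Operation classes from the text; each value is the verb used in prose."""
    CREATE = "creates"
    DESTROY = "destroys"
    FILTER = "filters"
    INSERT = "inserts"
    REMOVE = "removes"
    RETRIEVE = "retrieves"
    PROCESS = "processes"
    VALIDATE = "validates"


class DataKind(Enum):
    OBJECT = "object"
    AGGREGATE = "aggregate"
    NAME = "name"


@dataclass(frozen=True)
class DataAnnotation:
    singular: str
    plural: str
    kind: DataKind
    aggregate_form: str = "set"  # e.g. "set" or "sequence"


@dataclass(frozen=True)
class Component:
    name: str
    operation: Operation
    operand: DataAnnotation


def noun_phrase(data: DataAnnotation) -> str:
    """Describe data by its category, not by its concrete representation."""
    if data.kind is DataKind.AGGREGATE:
        return f"a {data.aggregate_form} of {data.plural}"
    if data.kind is DataKind.NAME:
        return f"the name of a {data.singular}"
    return f"a {data.singular}"


def summarize(components):
    """One sentence per component, omitting components that only validate data."""
    return [
        f"{c.name} {c.operation.value} {noun_phrase(c.operand)}."
        for c in components
        if c.operation is not Operation.VALIDATE
    ]
```

With a `contacts` annotation marked as an aggregate, `noun_phrase` yields "a set of contacts" whether the underlying representation is an array of pointers or something else, and a validating component such as the hypothetical `Input-Check` is left out of the summary.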

Function categorizations can be applied to segments
of components or groups of components, as well
as to individual components. For example, a set of
routines may be employed to process radar data, or
a set of statements may be employed to validate the
data.

Another  type  of  information  that  plays  a  prominent 
role  in  I-Doc  descriptions  is  information  about  require¬ 
ments,  particularly  nonfunctional  requirements.  A  set 
of  attributes  such  as  speed  requirements  or  accuracy 
requirements  may  be  associated  with  functional  com¬ 
ponents  and  data. 

In  order  to  determine  how  to  render  data  dictionary 
elements  most  effectively  in  natural  language,  I-Doc 
uses  grammatical  annotations.  The  annotations  used 
in  I-Doc  are  based  on  those  used  in  the  ARIES  require¬ 
ments  acquisition  system  for  annotating  specifications 
[7].  Data  expressing  relationships  between  objects  are 
categorized  as  to  whether  they  are  attributes,  actions, 
circumstances, classes, or relations. Attributes describe
properties of the object, actions describe actions
that involve the object, circumstances describe states
of the object, classes identify categories to which the
object belongs, and relations are default data relationships.
Objects participating in such relationships can
assume  one  of  several  grammatical  categories,  e.g.,  ac¬ 
tor,  goal,  location,  or  beneficiary.  These  categories  are 
drawn  from  case  grammars  for  natural  language  [5]. 

Finally,  descriptions  of  classes  of  behavior,  called 
scenarios, are useful in the description process. Scenarios
are useful for defining system requirements, as
a  way  of  describing  types  of  behavior  that  a  system 
should  or  should  not  exhibit  [1],  which  can  be  used  to 
validate  system  specifications.  They  are  intended  to 
serve  two  roles  within  I-Doc.  First,  scenarios  can  be 
used  to  illustrate  system  behavior.  Second,  scenarios 
provide the context in which to describe systems. If
the  I-Doc  user  is  trying  to  fix  a  bug,  for  example,  then 
if  a  scenario  illustrating  the  bug  is  available  it  can  be 
used  by  I-Doc  to  focus  on  describing  those  components 
of  the  system  that  are  relevant  to  the  bug. 

6 Knowledge Acquisition

The  information  described  above  is  acquired  from  a 
variety of sources. These sources will be described below.
It is important to emphasize, though, that I-Doc
can  still  generate  comprehensible  system  descriptions 
without  much  of  this  information.  The  added  design 
information  is  used  to  improve  the  quality  of  system 
descriptions,  and  can  be  acquired  when  and  as  appro¬ 
priate. 

Some  information  is  available  in  front-end  CASE 
tools such as Software through Pictures [6]. I-Doc
will  have  the  ability  to  query  one  or  more  such  CASE 
repositories  in  order  to  extract  such  information  if 
available. 

Another  means  of  acquiring  the  documentary  in¬ 
formation  is  through  a  special  acquisition  interface. 
This  method  is  used  especially  for  inputting  grammat¬ 
ical  annotations  and  design  component  classifications. 
These  annotations  need  not  be  selected  directly;  in¬ 
stead,  the  person  entering  the  information  can  request 
that  I-Doc  attempt  to  provide  the  annotations  au¬ 
tomatically,  and  present  samples  of  natural  language 
output  based  upon  those  annotations.  Although  the 
grammatical  annotations  are  based  on  linguistic  con¬ 
cepts  that  may  be  unfamiliar  to  software  engineers, 
it  is  easy  to  see  when  the  generated  natural  language 
is  awkward  or  incorrect.  Once  the  user  has  selected 
from among alternative descriptions generated by I-Doc,
I-Doc saves the annotations used to produce that
sample  output. 

There  are  three  other  ways  in  which  information 
for  constructing  system  descriptions  is  obtained.  One 
means  is  through  the  use  of  object  hierarchies.  If  an 
object  class  has  a  particular  set  of  attributes,  its  spe¬ 
cializations  are  likely  to  have  similar  attributes.  A 
second  approach  is  to  annotate  the  models  used  in 
automated  program  synthesis  systems.  The  knowl¬ 
edge  bases  of  specialized  knowledge-based  synthesis 
systems,  such  as  user  interface  development  systems, 
can  be  augmented  to  support  the  generation  of  docu¬ 
mentation  and  help  as  well  [12].  Integration  of  I-Doc 
with  one  or  more  such  systems  is  an  option  being  con¬ 
sidered  for  future  development. 

The  third  source  of  design  information  for  docu¬ 
mentation  is  code  analysis.  Analysis  routines  can  de¬ 
tect  components  that  appear  to  be  creating  objects, 
inserting  into  or  removing  from  data  aggregates,  val¬ 
idating  data,  etc.  Such  analysis  is  further  facilitated 
when  some  design  components  are  already  annotated; 
e.g.,  when  data  is  designated  as  an  aggregate,  it  makes 
sense to look for routines that add and remove elements
from that aggregate. The analysis capabilities
of Reasoning Systems’ reengineering technology,
in  particular  Refine/Ada,  are  being  used  as  the  basis 
for  such  analysis  capability. 

Use  of  the  above  capabilities  for  extracting  rele¬ 
vant  design  annotations  is  one  way  in  which  reengi¬ 
neering  technology  plays  a  prominent  role  in  the  con¬ 
struction  of  documentation  in  I-Doc.  In  fact,  in  order 
for  automated  generation  of  documentation  to  suc¬ 
ceed  it  must  be  viewed  at  least  partly  as  a  reverse 
engineering  enterprise.  Preparing  a  design  for  doc¬ 
ument  generation  involves  adding  design  information 
that  is  not  present  in  the  original  design.  Front-end 
CASE  tools,  executable  specification  languages,  and 
other  advanced  forward  engineering  technologies  can 
reduce  the  amount  of  information  that  must  be  recap¬ 
tured,  but  even  then  some  information  must  be  made 
explicit that is implicit in the design. Thus a combination
of interactive design capture and design recovery
techniques appears to be essential.


7 Repository Storage and Maintenance

As  design  information  is  acquired  it  must  be  associ¬ 
ated  with  the  code  and  maintained.  The  basic  mecha¬ 
nism  being  used  to  achieve  this  is  to  add  attributes  to 
the  Refine  representation  of  program  parse  trees.  In 
order  to  permanently  associate  these  attributes  with 
the  code,  the  following  techniques  are  planned.  First, 
Ada  pragmas  will  be  used  to  insert  the  information  di¬ 
rectly  into  the  code.  Pragmas  were  originally  intended 
to  record  information  to  guide  compilers  in  generating 
object  code.  In  an  analogous  fashion  they  can  be  used 
to  guide  the  generation  of  system  descriptions. 
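
To make the pragma idea concrete, the sketch below recovers annotations from Ada source text with a regular expression. The pragma name `Doc_Annotation`, its three-argument form, and the sample procedure are hypothetical; the paper does not specify the pragma syntax I-Doc will use, and a real implementation would work on the Refine parse tree rather than on raw text.

```python
import re

# Hypothetical annotation form: pragma Doc_Annotation (Entity, Key, "Value");
PRAGMA = re.compile(
    r'pragma\s+Doc_Annotation\s*\(\s*(\w+)\s*,\s*(\w+)\s*,\s*"([^"]*)"\s*\)\s*;',
    re.IGNORECASE,
)

# Invented sample source, used only to exercise the extractor.
EXAMPLE_SOURCE = '''
procedure Scan_Generation is
   pragma Doc_Annotation (Scan_Generation, Operation, "create");
   pragma Doc_Annotation (Scan_Generation, Output, "scan pattern");
begin
   null;
end Scan_Generation;
'''


def extract_annotations(ada_source: str) -> dict:
    """Collect annotations as {entity: {key: value}}; keys are lower-cased."""
    annotations = {}
    for entity, key, value in PRAGMA.findall(ada_source):
        annotations.setdefault(entity, {})[key.lower()] = value
    return annotations
```

Because the annotations travel inside the source file, they remain available to any tool that can read the code, which is the property the approach relies on.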

In  the  longer  term,  it  may  be  appropriate  to  ex¬ 
tend  the  data  model  of  a  CASE  tool  such  as  Software 
through  Pictures  in  order  to  record  the  design  infor¬ 
mation.  However,  that  should  not  be  a  substitute  for 
inserting  design  information  directly  into  the  source 
code.  In  order  for  the  captured  design  information  to 
be useful, it must continue to be associated with the
code  as  it  is  maintained  over  time.  At  the  present  time 
the  best  way  of  ensuring  this  is  to  integrate  the  design 
information  into  the  source  code,  so  that  maintenance 
using  a  front-end  CASE  tool  is  not  required.  For  lan¬ 
guages  that  do  not  have  constructs  similar  to  pragmas, 
the  approach  will  be  to  add  grammatical  extensions  to 
the  source  language  to  provide  such  constructs,  which 
can  then  be  removed  from  the  source  code  via  trans¬ 
formations,  using  a  tool  such  as  Refine. 


8  Inputting  User  Queries 

If  the  design  knowledge  associated  with  a  design  is 
sufficiently  rich,  it  can  be  used  as  the  basis  for  answer¬ 
ing  a  variety  of  queries.  Following  the  approach  taken 
in  Lehnert’s  original  work  on  question  answering  [8] 
and  further  developed  by  research  in  expert  system 
explanation,  such  as  the  work  of  Moore  and  Swartout 
[10],  common  questions  about  software  have  been  cate¬ 
gorized  into  types.  In  the  initial  version  of  I-Doc,  these 
question  types  will  be  available  to  the  user  as  explicit 
choices  from  which  the  user  can  choose.  Examples 
of  question  types  include  Describe  Function,  Describe 
Design,  Describe  Interface,  and  Describe  Use.  In  sub¬ 
sequent  versions  these  choices  will  be  automated  to 
a  greater  extent,  so  that  the  user  can  simply  request 
“Describe”  and  the  system  will  construct  a  combined 
description  appropriate  to  the  context. 

The main difference between question input in I-Doc
and question input in expert system explanation
is the use of hypertext as a medium for posing questions.
Question-answering systems such as Moore and
Swartout’s facility in the Explainable Expert System
(EES) allow the user to point to an element
of a description and ask one of several follow-up questions
about it. In hypertext, the interaction is more
limited: the user clicks on a section of text where a
link begins, causing the system to jump to the other
end of the link. The user does not have the option of
selecting one of several operations to perform on the
link. In order to overcome this difficulty, we are experimenting
with making choices explicit as lists of selectable
items at the bottom of the hypertext display,
as shown in Figure 3. It remains to be seen how effective
this technique is in comparison to more ordinary
menu-based approaches.


9  Extracting  Relevant  Information 

Once  the  type  and  the  object  of  the  query  have 
been  chosen,  the  information  suitable  for  inclusion 
in  the  presentation  must  be  retrieved  and  presented. 
Retrieval  of  relevant  information  can  be  easily  ac¬ 
complished  using  the  retrieval  and  query  mechanisms 
available  in  Refine,  Software  through  Pictures,  and 
other  tools.  Difficulties  arise,  though,  when  the 
amount  of  retrieved  information  is  too  much  to  present 
in  such  a  way  that  the  reader  can  assimilate  it.  In 
such cases filtering techniques may be employed to focus
on the information of greatest potential interest to
the  reader. 


The filtering procedures in I-Doc proceed by marking
code as either central, interesting, or ignored. Central
components are the primary focus of the description,
and consist of all code satisfying a particular criterion,
such as code matching steps of a scenario. Components
are designated interesting because of their interactions
with the central components. Ignored components
are removed either because they are explicitly
filtered out, or because they are not found to be central
or interesting. The following are some criteria that
have been found in pencil-and-paper studies to be useful
in determining code to be central or ignored; others
are expected to be identified.

•  Exceptional  case  handling.  In  some  descriptions, 
such  as  the  one  shown  in  Figure  3,  code  for  han¬ 
dling  exceptional  cases  is  unimportant;  in  other 
cases  (such  as  when  fixing  bugs)  such  code  may 
be central. Either way, such code can be detected in
a  significant  number  of  cases.  Ada  has  explicit 
constructs  for  raising  exceptions.  Additionally,  if 
a  variable  is  being  used  as  an  error  flag,  tests  of 
the  variable  can  be  detected  automatically,  and 
branches  of  the  code  marked  accordingly. 

•  Code  that  sets  and/or  reads  a  variable  or  at¬ 
tribute. 

•  Code  that  assigns  an  attribute  or  variable  to  a 
specific  value,  or  checks  that  the  attribute  or  vari¬ 
able  has  a  specific  value. 

Once components are found to be central, surrounding
code is often marked as interesting and is therefore
included.  The  motivation  for  this  is  to  provide  some 
context  so  that  the  focussed  operations  are  more  eas¬ 
ily  understood.  For  example,  if  a  routine  sets  an  in¬ 
teresting  attribute  of  an  object,  assignments  to  other 
attributes  of  the  same  object  in  the  same  procedure 
may  be  included  as  well. 
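
The marking scheme can be sketched as a two-pass procedure over a flat list of statements. This is an assumption-laden simplification: the `Statement` record, the use of a single (object, attribute) pair as the focus criterion, and the rule that "surrounding" means "same procedure, same object" are illustrative choices, not I-Doc's design, which is to be built on Refine's transformation facilities.

```python
from dataclasses import dataclass
from enum import Enum


class Mark(Enum):
    CENTRAL = "central"
    INTERESTING = "interesting"
    IGNORED = "ignored"


@dataclass(frozen=True)
class Statement:
    procedure: str
    text: str
    assigns: frozenset = frozenset()  # (object, attribute) pairs written
    handles_exception: bool = False


def mark_statements(statements, focus, include_exceptions=False):
    """Return one Mark per statement; focus is an (object, attribute) pair."""
    # Pass 1: decide central and explicitly filtered statements.
    first_pass = []
    for s in statements:
        if s.handles_exception:
            # Exceptional-case code is central when debugging, else filtered.
            first_pass.append(Mark.CENTRAL if include_exceptions else Mark.IGNORED)
        elif focus in s.assigns:
            first_pass.append(Mark.CENTRAL)
        else:
            first_pass.append(None)  # undecided until pass 2
    # Procedures containing an assignment to the focus attribute.
    central_procs = {
        s.procedure
        for s, m in zip(statements, first_pass)
        if m is Mark.CENTRAL and focus in s.assigns
    }
    # Pass 2: context around central code becomes interesting.
    marks = []
    for s, m in zip(statements, first_pass):
        if m is None:
            same_object = any(obj == focus[0] for obj, _ in s.assigns)
            nearby = s.procedure in central_procs and same_object
            m = Mark.INTERESTING if nearby else Mark.IGNORED
        marks.append(m)
    return marks
```

Calling `mark_statements` with `include_exceptions=True` corresponds to the bug-fixing case, where exception handling moves from ignored to central.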

The  transformation  facilities  in  Refine  are  to  be  em¬ 
ployed  to  implement  this  filtering  process.  The  trans¬ 
formation  pattern  language  can  be  used  in  some  cases 
to  detect  the  code  that  is  of  interest.  In  other  cases, 
transformations  can  be  used  to  perform  simple  ex¬ 
pression  simplifications  in  order  to  facilitate  pattern 
matching.  A  tool  that  could  recognize  all  instances 
of  potentially  interesting  code  would  require  a  more 
powerful simplifier than is envisioned for I-Doc. Instead,
I-Doc will either inform the reader when the
view being presented is possibly incomplete, or simply
provide a view that is somewhat broader than is
strictly necessary.

10 Presentation Layout

Once  relevant  information  has  been  identified,  I- 
Doc  must  determine  what  media  to  use  to  present  the 
information,  and  how  to  present  the  information  using 
those  media.  In  the  long  term,  we  hope  to  be  able  to 
employ a variety of presentation media, including mixtures
of text and diagrams. At first, though, the focus
will  be  on  natural  language  generation.  Such  gener¬ 
ated  text  may  be  supplemented  with  diagrams  if  such 
diagrams  have  already  been  constructed  and  are  avail¬ 
able,  as  is  currently  the  case  with  CASE-generated 
documents. 

The  overall  structure  of  each  description  that  is  gen¬ 
erated  will  be  determined  by  a  presentation  template, 
a  library  of  which  will  be  included  in  I-Doc’s  knowl¬ 
edge  base.  These  templates  provide  standard  ways  of 
presenting  different  aspects  of  a  system.  Each  tem¬ 
plate  will  have  a  set  of  selection  criteria,  based  upon 
the  amount  of  various  types  of  information  that  is  to 
be  presented,  and  the  assumed  level  of  expertise  of 
the  user.  Slots  within  the  template  are  then  filled  out 
using  a  natural  language  generator. 
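
Template selection and slot filling might look roughly as follows. The two templates, their selection thresholds, and the integer expertise scale are invented for illustration, and plain strings stand in for the output of the natural language generator that would fill the slots in I-Doc.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class PresentationTemplate:
    name: str
    max_components: int  # most components this layout is meant to hold
    min_expertise: int   # 0 = low familiarity; higher means more expert
    body: Template


# Hypothetical template library; I-Doc's actual templates are not published.
TEMPLATES = [
    PresentationTemplate(
        "overview", 3, 0,
        Template("$subject consists of $parts. $summary"),
    ),
    PresentationTemplate(
        "detailed", 10, 1,
        Template("$subject is implemented by $parts. $summary See also: $links."),
    ),
]


def choose_template(n_components, expertise, templates=TEMPLATES):
    """Pick the most demanding template whose selection criteria are met."""
    candidates = [
        t for t in templates
        if n_components <= t.max_components and expertise >= t.min_expertise
    ]
    if not candidates:
        raise LookupError("no template fits this amount of information")
    return max(candidates, key=lambda t: t.min_expertise)


def fill(template, slots):
    """Fill the template's slots; raises KeyError if a slot is missing."""
    return template.body.substitute(slots)
```

A reader with low familiarity asking about a two-component function would receive the overview template; the same query from a more expert reader would select the detailed one.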

The  natural  language  generator  consists  of  two 
components.  The  basic  generation  work  is  performed 
by  a  Functional  Unification  Generator,  which  can 
construct arbitrary sentences from attribute-value descriptions
of the content to be expressed. A second
phrase  selection  component  constructs  attribute-value 
patterns  in  a  form  suitable  for  input  to  the  Functional 
Unification  Generator,  according  to  directives  con¬ 
tained  within  the  system  description  templates.  This 
architecture  was  used  successfully  in  ARIES  and  the 
KBSA  Concept  Demonstration  [11]  to  generate  text 
descriptions rapidly. Functional Unification Grammar
has been evaluated against competing methods
for natural language generation, and has been found
to be both flexible and efficient [9].


11  Presentation  Delivery 

As  indicated  in  Section  3,  Mosaic  is  being  employed 
as the hypertext delivery mechanism for I-Doc. Although
there are various commercial products available
that provide hypertext capability, Mosaic has a
set  of  features  that  make  it  particularly  suitable  to 
dynamic  documentation  generation. 

•  Mosaic  provides  interfaces  to  external  programs, 
so  that  the  user  buttoning  on  a  hypertext  link  can 
cause  a  program  to  be  invoked. 


•  It  interprets  hypertext  formatting  commands  dy¬ 
namically,  as  needed;  this  makes  it  possible  to 
construct  a  hypertext  document  dynamically  and 
display  it. 

•  It  runs  on  a  variety  of  platforms,  including  X 
Windows,  Microsoft  Windows,  and  Macintosh. 

• It is public domain, and the source code is readily
available, making it possible to customize the
system  for  use  within  I-Doc. 
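
The first two features suggest a simple delivery scheme: each description is emitted as a hypertext page whose links encode a follow-up query, so that selecting a link invokes the generator again. The Python sketch below illustrates this under stated assumptions; the `/idoc` path, the `component` and `question` parameter names, and the page layout are hypothetical and are not taken from I-Doc.

```python
from html import escape
from urllib.parse import urlencode


def query_link(label, component, question, base="/idoc"):
    """Build a link whose target encodes a follow-up query for the generator."""
    query = urlencode({"component": component, "question": question})
    href = f"{base}?{query}"
    return f'<a href="{escape(href, quote=True)}">{escape(label)}</a>'


def render_page(title, paragraphs, follow_ups):
    """Assemble a page; paragraphs are HTML fragments, follow_ups are
    (label, component, question) triples listed at the bottom."""
    body = "\n".join(f"<p>{p}</p>" for p in paragraphs)
    items = "\n".join(f"<li>{query_link(*f)}</li>" for f in follow_ups)
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n<h1>{escape(title)}</h1>\n{body}\n"
        f"<ul>\n{items}\n</ul>\n</body></html>"
    )
```

Listing the follow-up links at the bottom of the page mirrors the explicit choice lists discussed in Section 8, since a plain hypertext link cannot itself offer several operations.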


12  Further  Plans  and  Prospectus 

The  I-Doc  project  has  just  begun;  it  is  expected  to 
continue  for  another  three  years.  The  project  plans  to 
make  available  intermediate  versions  of  the  system  on 
a  periodic  basis.  In  the  near  term,  the  capability  will 
be  demonstrated  on  a  military  application  selected  in 
collaboration  with  the  US  Air  Force’s  Wright  Labora¬ 
tory. 

If  resources  are  available,  I-Doc  will  be  extended 
so  that  it  can  describe  systems  to  other  classes  of 
users besides maintainers, such as clients and end-users.
This will further enhance the value of documentation
generation technology to software development
projects. 

In  the  long  run,  dynamic  generation  of  documen¬ 
tation  will  prove  successful  if  the  perceived  benefits 
outweigh  the  costs  of  additional  design  capture.  The 
benefits  will  be  particularly  apparent  in  projects  that 
undergo  periodic  design  reviews,  since  automated  doc¬ 
umentation  should  prove  to  be  a  great  benefit  to  the 
design  review  process.  Reengineering  and  program 
synthesis  technology  will  gradually  reduce  the  amount 
of  interaction  between  developers  and  I-Doc  necessary 
for  design  capture.  In  the  near  term,  making  enhanced 
hypertext  capabilities  available  to  developers  is  likely 
to  bring  benefits  in  itself.  Hypertext  is  gradually  being 
adopted  in  software  development  practice,  and  the  ac¬ 
tivities  of  the  I-Doc  project  will  aim  to  help  accelerate 
this  trend. 

Abstract

This  paper  presents  a  case  study  of  domain 
specific software reuse, also called vertical reuse, where
both  the  architecture  and  individual  classes  are  reused. 
The  applications  domain  we  deal  with  is  the  domain  of 
visual  interactive  software  tools.  The  paper  describes  the 
architecture  itself,  the  reverse  engineering  process  by 
which  it  was  obtained,  and  the  forward  engineering 
process  by  which  it  was  reused.  The  architecture  is  called 
orthogonal  architecture,  and  it  consists  of  classes 
organized  into  layers  and  threads. 

1. Introduction

The  reuse  in  vertical  domain  is  characterized  by 
a  reuse  of  the  whole  software  architecture,  which  is  being 
adjusted  to  satisfy  a  new  set  of  requirements.  In  this 
respect,  reuse  in  vertical  domain,  software  evolution,  and 
perfective maintenance overlap to a large extent. In our
case  study,  we  studied  a  process  of  reuse  in  vertical 
domain  of  visual  interactive  software  tools.  We 
developed  an  orthogonal  architecture  for  that  domain, 
which  is  particularly  suitable  for  vertical  reuse. 
Evolutionary domain life cycle was the process used in
the case study. The architecture was obtained through
reengineering of an earlier tool.

The major difference between the evolutionary
domain life cycle and the waterfall model lies in the
activities that span the whole application domain. These
are  domain  analysis  [21,  22,  23,  24],  and  domain  design 
[8].  Domain  analysis  examines  the  requirements  of  a 
family  of  systems.  The  end  result  of  the  domain  analysis 
phase  is  a  document  called  the  domain  specifications. 
Domain design includes definition of the data structures,
file formats, and important algorithm descriptions for a
specific domain of applications. The final product of the
domain design phase is a domain architecture that
describes  the  architecture  of  a  family  of  systems.  This 
domain  architecture  reflects  the  design  of  all  software 
systems  that  constitute  the  domain. 

We  studied  the  domain  of  visual  interactive 
tools.  Our  experience  in  building  these  systems  dates  to 
1986 when we developed a prototype of VIC: Visual
Interactive C [25, 26]. Other visual interactive editors
developed in our group were VIFOR [26], and EDG
(described  here  -  see  Appendix  A).  We  defined  a  set  of 
classes  -  some  of  them  general,  others  specialized  -  and 
all  interactions  among  them.  In  this  model  classes  may 
evolve, but the overall architecture remains unchanged.

2.  Previous  Work 

The concept of "program families" originates
with Parnas [16, 17, 18, 19, 20]. Program families are
defined as sets of programs whose common properties are
so extensive that it becomes advantageous to study the
common properties of these programs before analyzing
individual differences. Since then, other researchers such
as Neighbors [13, 14, 15], Barstow [6], Lubars [10, 11,
12], Bailin [4, 5], Kant [9], Arango [1, 2, 3], and Prieto-
Diaz [21] have associated the concept of a "family of
programs" with the idea of application domain.

Barstow [6] investigated the issue of domain-
specific automatic programming in the context of
two application domains, both related to oil well
logging. Similar studies have been conducted by other
researchers for other domains of knowledge. Examples of
such studies include those of Kant et al. [9], Bailin [4,
5], and Duim [7]. Our work is, in a sense, related to
these works because we also performed domain analysis
and domain design studies within a specific domain.

Prieto-Diaz [21] proposed a methodology for
domain analysis. He described three distinct steps:
pre-domain analysis, domain analysis, and post-domain
analysis. Pre-domain analysis comprised the definition
and scope of the domain, the identification of sources of
knowledge, and information about the domain.

Barstow and Kant [6, 9], among others,
performed domain analysis and domain design studies that
led to code generation within well-defined domains of
applications. These domains, quite different in nature,
possessed a high degree of cohesiveness, a well-
understood variability, and a low complexity level. Code
generation is advocated for such cases. For more
complex systems with a higher degree of variability and a
lower degree of cohesiveness, code generation is not
recommended. Instead, a knowledge base that captures
important aspects that characterize the family of systems
can facilitate its development. The knowledge base
consists of: (1) well-defined code components, (2) a
generalized system architecture for the family of
systems that constitute the domain, and (3) a set of
domain rules (i.e., domain knowledge). The domain
knowledge includes relationships among different
components that make up the system.
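
The three parts of such a knowledge base can be pictured as a small data structure. The sketch below is only an illustration in Python: the class name `KnowledgeBase`, its fields, and the sample component names are assumptions made for this example and do not come from the case study.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical container for the three parts listed in the text."""
    components: set = field(default_factory=set)      # (1) code components
    architecture: dict = field(default_factory=dict)  # (2) layer -> components
    rules: list = field(default_factory=list)         # (3) (user, used) pairs

    def add_rule(self, user, used):
        # A domain rule may only relate components that are already known.
        if user not in self.components or used not in self.components:
            raise ValueError("rule refers to an unknown component")
        self.rules.append((user, used))


# Illustrative content only; these names are not taken from EDG.
kb = KnowledgeBase(components={"menu", "graph", "database"})
kb.architecture = {2: ["menu"], 4: ["graph"], 6: ["database"]}
kb.add_rule("menu", "graph")
kb.add_rule("graph", "database")
```

Keeping the rules separate from the components reflects the point made above: the relationships among components are domain knowledge in their own right.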

3.  Orthogonal  architecture  for 
interactive  software  tools. 

We performed the domain analysis and design
study for an experimental and reusable software system
known as "The Environment for Decomposition and
Generalization" (EDG, pronounced "edge"). EDG
supports the OODG methodology of design [22]. The
EDG system has an architecture that can be easily
adapted for other systems within the domain of visual
interactive editors. Our goal was not to show that the
architecture of EDG is the best possible; rather, we were
satisfied with an architecture that possessed the following
attributes:

(1) Reusable: The architecture can be
transformed to perform the functions
expected of other visual interactive tools.

(2) Understandable: The architecture is
simple. Domain users who have a
minimum knowledge of visual interactive
tools are capable of understanding what
each class does and what its role is
within the domain, by reading the class
specifications.

(3) Adaptable: The classes are capable of
evolving. This capability involves the
mechanics of changing specific members
within a class while still preserving the
structural relationships between classes
(i.e., maintaining the overall generic
architecture for the whole system).

In order to accomplish these objectives, the EDG
system is partitioned both horizontally and vertically.
Horizontal partitioning represents layers or levels of
abstraction. On layer 1, function "main" controls the
application. Layer 2 implements the menu interface and
triggers callback functions chosen from the menu. Layer
3 encapsulates the interfaces for all operations supported
by EDG. At this layer, we defined a standard dialogue for
each operation. For example, if the operation is "Save
As," we know that there is a standard dialogue between
the user and the system. First, after this operation is
selected, a panel is presented and the user is asked to enter
a file name. Then, if the file name is valid and the user
hits the return key or presses the "ok" button, the file is
saved. In any other sequence of events, this "Save As"
operation is aborted.
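
The "Save As" dialogue described above amounts to a very small acceptance check on the sequence of user events. The following Python sketch illustrates that idea only; the event names (`enter_name`, `return_key`, `ok_button`) and the function `save_as_dialogue` are hypothetical and are not EDG identifiers.

```python
def save_as_dialogue(events, is_valid_name):
    """Return the file name to save under, or None if the operation aborts.

    `events` is the sequence of (kind, payload) user actions that follow
    the selection of "Save As"; `is_valid_name` is a caller-supplied check.
    """
    events = list(events)
    # Step 1: the panel is shown and the user must enter a file name.
    if len(events) != 2 or events[0][0] != "enter_name":
        return None
    name = events[0][1]
    # Step 2: a valid name must be confirmed with Return or the "ok" button.
    if is_valid_name(name) and events[1][0] in ("return_key", "ok_button"):
        return name
    # Any other sequence of events aborts the operation.
    return None
```

A caller would pass its own validity rule, for example `lambda n: bool(n) and "/" not in n`, and save the file only when a name is returned.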

The actual functionality of the operations was
defined at layer 4. Some of the classes which were
defined at this layer include: a class to draw graphs on
canvas, a class with algorithms for representing graphs, a
class with functions to scan C++ source code and to
extract relationships between objects, etc.


Finally, most operations in EDG need the services
of a database. The fifth and sixth layers perform these
services. Layer 5 encapsulates the specific data model
and database interface, while layer 6 is the database itself.
They are both custom made specifically for EDG.
The database is resident in main memory and is stored
on disk in a flat file. Efficiency was not a
concern, and hence relatively simple data representations
and search algorithms were employed.

Vertical partitioning of the EDG system
involves the division of the system into threads that are
orthogonal to layers. Threads are sets of classes related
to each other by the relationship of "use". Threads in our
architecture are largely independent of each other, with
very few classes from one thread using services of the
classes in other threads. Only layers 1 (highest) and 6
(lowest) are shared among the threads. The threads of
EDG are:

(1) Project, which supports commands that
operate on entire projects and their files.

(2) Graph, which supports graph display
and editing.

(3) Views, which selects information to be
presented in a window.

(4) Browser, which supports navigation
through the database.

(5) Analyzer, which extracts architectural
information from programs.

(6) Run, which interfaces EDG with other
tools (compilers, debuggers, etc.).

These six threads are truly universal within the
domain of visual interactive software environments.
Examples of software development environments that use
one or more of these threads include: Software through
Pictures, Powertools from Iconix, and Teamwork from
Cadre.

Taken together, these two orthogonal system
partitionings provide an interesting map of the EDG
system. For each specific layer and thread, there is at
least one class implementing the required functionality.
This orthogonality substantially improves the
understanding of software, and therefore facilitates its
reuse.
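
The map formed by the two partitionings can be sketched as a grid indexed by layer and thread. The Python below is a minimal illustration under stated assumptions: the thread names follow the list given earlier, while `build_map`, `empty_cells`, and the sample class names are invented for this example. A class whose thread is `None` is treated as shared, which models layers 1 and 6.

```python
LAYERS = range(1, 7)
THREADS = ["Project", "Graph", "Views", "Browser", "Analyzer", "Run"]


def build_map(classes):
    """Group (class_name, layer, thread) triples into a layer-by-thread grid."""
    grid = {(layer, thread): [] for layer in LAYERS for thread in THREADS}
    for name, layer, thread in classes:
        # thread=None marks a shared class, placed in every thread.
        targets = THREADS if thread is None else [thread]
        for t in targets:
            grid[(layer, t)].append(name)
    return grid


def empty_cells(grid):
    """Cells with no implementing class; a complete map should have none."""
    return sorted(cell for cell, names in grid.items() if not names)


# Hypothetical, deliberately incomplete sample.
sample = [("Main", 1, None), ("GraphMenu", 2, "Graph"), ("Database", 6, None)]
grid = build_map(sample)
```

Listing the empty cells gives a quick check of the property stated above, that every layer and thread combination has at least one implementing class.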


4.  Reverse  engineering:  Creating 
EDG. 

EDG was created from earlier projects, which had
the same or overlapping functionalities but did not
possess the orthogonal architecture described above.
Reverse engineering was applied to code written partially
in C and partially in C++. The effort involved domain
analysis, which had to be done for the whole domain of
visual interactive tools. That was followed by a domain
design, where the threads, layers, and individual classes of
EDG were defined. Finally, the existing code was analyzed
and reengineered into the new code, fitting the new
architecture. The classes of the new code fall into the
following categories: classes transferred from the
previous projects with modifications, and classes written
from scratch. There were no classes which could be
reused from the previous project without any change.

Of the total 11,558 lines of EDG code, a total
of 4^60 belong to the classes written from scratch.
These classes belong mostly to the top and bottom layers,
where the impact of the new architecture was most felt.
The rest belong to the classes that were modified to a
larger or lesser degree. The effort to reengineer the old
code into EDG was
approximately 40% of the estimated effort it would take
to implement EDG from scratch. For more detailed
numbers, see [29].


5.  Forward  engineering:  From 
EDG  to  EDFD. 

The reusability of the orthogonal architecture of
EDG was tested in a case study where it was reused for a
new visual interactive tool: the Environment for Data Flow
Diagrams (EDFD), which partially supports the
methodology described in [27]. A more detailed
description of both EDG and EDFD is included in
Appendices A and B. As the first step, we identified the
set of operations required by the new system and
organized them into a top menu.

In the second step, "assign operations," we
mapped the set of operations identified in the previous
step onto the pre-existing set of threads. For example, the
operation "Hide Object" is mapped to thread "graph."
After this phase, when all operations for a new system
are assigned to a thread, all unnecessary threads are
removed from the system. In our case EDFD required
three threads: the Data-Flow Editor, the Data Dictionary
Editor, and the Defining Functions Editor. The first
thread was mapped onto the Graph Editor thread of EDG,
while both remaining threads were mapped onto the Project
thread of EDG. Hence the project thread is modified in two
different ways, once to serve as the Data Dictionary thread
and a second time as the Defining Functions thread. The rest
of the threads of EDG were no longer needed, and
therefore were discarded.
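
The thread assignment and pruning described above can be expressed as a mapping followed by a set difference. This Python sketch uses the thread names from the text; the function `plan_reuse` and the dictionary layout are assumptions made for illustration, not part of EDG or EDFD.

```python
EDG_THREADS = {"Project", "Graph", "Views", "Browser", "Analyzer", "Run"}

# New EDFD thread -> EDG thread it is derived from (as stated in the text).
EDFD_MAPPING = {
    "Data-Flow Editor": "Graph",
    "Data Dictionary Editor": "Project",
    "Defining Functions Editor": "Project",
}


def plan_reuse(mapping, available):
    """Return (times each old thread is reused, old threads to discard)."""
    reused = {}
    for new_thread, old_thread in mapping.items():
        if old_thread not in available:
            raise KeyError(f"{new_thread}: no such thread {old_thread}")
        reused[old_thread] = reused.get(old_thread, 0) + 1
    discarded = sorted(available - set(reused))
    return reused, discarded


reused, discarded = plan_reuse(EDFD_MAPPING, EDG_THREADS)
```

Here the Project thread is counted twice because it is adapted in two different ways, and the four threads with no mapping are the ones that are dropped.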

The next step consists of modification of the
classes in the threads. When modifying a class, we
mapped the new specifications of the class onto the old
one, and compared the new requirements with the
existing code. For "Add an Object," we identified the set
of functions in "graph" associated with that operation.
First we searched the interface of class "graph" and
checked whether or not that function exists. If so, we
considered that function and all functions called by it for
possible modifications.

Function modifications were effected "top-down"
until all of the selected functions were completely
defined. Three different scenarios occurred:

(1) The function can be reused "as is." No
changes were required.

(2) The function needs to be "adjusted" to
conform to a new set of requirements.
Here, we would isolate the portion of the
function which can be reused without other
changes, and then we would add code to
perform the new required functionality.

(3) The function does not exist. Here, we
implement the new function to
accommodate a new set of requirements.
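
The three scenarios, applied top-down from an operation to the functions it calls, can be sketched as a depth-first walk over a call graph. The Python below is a hedged illustration: the call graph, the function names, and the sets `existing` and `needs_change` are hypothetical inputs an engineer would supply, not data from the case study.

```python
def classify(function, existing, needs_change):
    """Assign one of the three scenarios to a required function."""
    if function not in existing:
        return "new"       # (3) the function must be implemented
    if function in needs_change:
        return "adjust"    # (2) reuse part of it and add code
    return "as is"         # (1) reuse unchanged


def review_top_down(root, calls, existing, needs_change):
    """Classify `root` and everything reachable from it, depth-first."""
    plan, stack, seen = {}, [root], set()
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue
        seen.add(fn)
        plan[fn] = classify(fn, existing, needs_change)
        # Push callees in reverse so they are visited in their listed order.
        stack.extend(reversed(calls.get(fn, [])))
    return plan


# Hypothetical call graph for an "Add an Object" operation.
calls = {"add_object": ["create_node", "redraw"], "create_node": ["allocate_id"]}
plan = review_top_down(
    "add_object", calls,
    existing={"add_object", "create_node", "redraw"},
    needs_change={"create_node"},
)
```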

The whole architecture was scanned through this
process, and adapted for the new set of requirements.

The following statistics on the extent of
modifications were gathered:

Layer 1  Reused without adaptations.

Layer 2  Very easy adaptation.
         74.2 % of code was used without adaptations.
         25.8 % of code was used with some modification.
         Unnecessary to introduce new functions.

Layer 3  Very easy adaptation.
         94.6 % of code was used without adaptations.
         4.5 % of code was used with adaptations.
         0.9 % of code was implemented from "scratch."
         Unnecessary to introduce new functions.

Layer 4  Refers to "graph_editor" (our worst case scenario).
         Very easy adaptation for already existing functions.
         73.5 % of code was used without adaptations.
         2.2 % of code was used with adaptations.
         24.3 % of code was implemented from "scratch."
         Necessary to introduce new functions.

Layer 5  Very easy to modify.
         88.3 % of code was used without adaptations.
         3.8 % of code was used with adaptations.
         7.9 % of code was implemented from "scratch."
         Unnecessary to introduce new functions.

Layer 6  Very easy to modify.
         78.4 % of code was used without adaptations.
         2.3 % of code was used with modifications.
         19.3 % of code was implemented from "scratch."
         Unnecessary to introduce new functions.
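
As a quick consistency check, the per-layer shares can be tabulated and summed. The Python sketch below makes two assumptions that should be read as such: the Layer 3 share of unchanged code is taken as 94.6, the value for which the three shares total 100, and Layer 2 is given 0.0 for new code because its two reported shares already total 100. Layer 1 is left out because it was reused whole and no percentages are reported for it.

```python
# layer -> (% reused unchanged, % adapted, % written from scratch)
REUSE = {
    2: (74.2, 25.8, 0.0),   # 0.0 is implied, not reported
    3: (94.6, 4.5, 0.9),    # 94.6 is inferred from the other two shares
    4: (73.5, 2.2, 24.3),
    5: (88.3, 3.8, 7.9),
    6: (78.4, 2.3, 19.3),
}

# Every layer's shares should account for all of its code.
for layer, shares in REUSE.items():
    assert abs(sum(shares) - 100.0) < 0.05, layer

# Layer with the largest share of newly written code.
most_new_code = max(REUSE, key=lambda layer: REUSE[layer][2])
# Layer with the largest share of code reused unchanged.
most_unchanged = max(REUSE, key=lambda layer: REUSE[layer][0])
```

On these figures the graph editor layer carries the largest share of new code, which matches its description as the worst case.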

This data shows that the classes requiring the least
amount of work were either high in the class hierarchy
(layer one) or low (layers five and six). Classes in the
middle of the hierarchy required more work.

The total size of EDFD is 4,606 lines. The total
effort of the reuse represents 37% of the estimated effort
to build the system from scratch. For more detailed
numbers, see [29].


6.  Conclusions. 

We found that building the "perfect architecture"
is a step-by-step process, very similar to exploratory
programming. As we coped with the new requirements
of EDFD, we were able to perfect the existing
architecture of EDG and improve its classes, thus making
the architecture more universal and adaptable. We believe
that the architecture of EDG developed in the case study
is suitable for most visual interactive tools, and future
reuse and retroactive improvement will make it even
more reusable.

We developed a process of three steps to adapt a
generic orthogonal architecture to a new set of
requirements. Please note that the process is domain
independent. Hence we expect to be able to use
this process in other domains as well.

We conjecture that the orthogonal architecture
can be developed for domains other than visual interactive
tools. We did some preliminary studies of the nucleus of
an operating system, and found the same methodology and
architecture ideas applicable there.

In our case study, we found encouraging
productivity figures. To reengineer an unconstrained
architecture into an orthogonal one, we spent
approximately 40% of the time compared to
implementation from scratch. Adapting this
architecture to a new set of requirements, we spent
approximately 37% of the time compared to building the
system from scratch. Hence to reengineer a system
into an orthogonal architecture, and then forward engineer it
to a new set of requirements, was cost effective already
on the first system, where it cost 77% of the estimated
original cost. We conjecture that each subsequent system
within that domain should again cost approximately 40%
of the original cost to build. We find these preliminary
figures to be very encouraging, and hope to verify them
by studies in other domains.
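
The cost figures above can be reproduced with a few lines of arithmetic. The Python sketch follows the text in adding the two percentages, although each is relative to the from-scratch estimate of a different system (EDG and EDFD). It further assumes, purely for illustration, that every system in the domain would cost one unit to build from scratch; `cumulative_cost` is a hypothetical helper.

```python
REENGINEER = 0.40  # reengineering into EDG, relative to a from-scratch build
ADAPT = 0.37       # adapting EDG into EDFD, relative to a from-scratch build

# Cost of the first reused system, as computed in the text.
first_system = REENGINEER + ADAPT


def cumulative_cost(n_systems, later=0.40):
    """Total cost of n >= 1 systems, in units of one from-scratch build."""
    return first_system + later * (n_systems - 1)
```

Under these assumptions three systems would cost about 1.57 units in total, against 3 units if each were built from scratch.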
Appendix A

EDG  Requirements 

EDG is an experimental software tool used in
the development of C++ programs; it enforces the use of
the Object-Oriented Decomposition and Generalization
methodology [27]. With EDG, C++ programs can be
displayed and edited in two forms: class diagrams and
code. A more detailed description of EDG follows.

Six major threads constitute EDG:

•  the project manager thread

•  the graph editor thread

•  the view manager thread

•  the code analyzer thread

•  the browser thread

•  the code generator thread

The  Project  Manager 

The project manager subsystem implements the
commands which operate on entire projects (project
descriptions and files). A brief description of each
command follows.

Open: Opens an existing project or starts a new project.

Close: Resumes the currently loaded project.

Add File: Adds a new file to the project description.

Remove File: Removes a file from the project description.

Save: Updates the current project description in a file.

Save As: Updates the current project description in a new
file (possibly the same).

The  Graph  Editor 


Graph  operations  are  used  to  build  and 
manipulate  the  architecture  of  a  program.  By  using  these 
operations,  graphs  may  be  created,  deleted,  and  modified. 
Specially  designed  canvas  windows  are  used  to  display 
these  graphs.  In  association  with  the  graph  editor,  we 
implemented  Sugiyama's  algorithm  to  display 
hierarchical  graphs. 

New: Starts a new graph editor window and initializes the
graph database.

Add: Adds objects to the graph database. In EDG, objects can
be of two kinds: classes and relationships between classes
(use or inheritance relationships).

Delete: Removes objects from the database.

Rename: Changes the name of an object.

Select: Changes the active object.

Color: Changes colors of objects.

Hide: Removes an object and associated relationships from a
graph. These objects are not displayed, but are kept in the
database.

Move: Changes object positions in a displayed graph.

Load Graph: Reads and displays the information from a file
which contains the description of a graph.

Save Graph: Writes the graph description in a named file for
future retrieval.
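
The difference between Delete and Hide is easiest to see in a small model of the graph database. The Python sketch below is an assumption-laden illustration: the class `GraphDatabase` and its methods are invented for this example and say nothing about how EDG actually stores its data.

```python
class GraphDatabase:
    """Hypothetical store of classes and the relationships between them."""

    def __init__(self):
        self.classes = set()
        self.relationships = set()   # (kind, source, target) triples
        self.hidden = set()          # names of classes hidden from display

    def add_class(self, name):
        self.classes.add(name)

    def add_relationship(self, kind, source, target):
        if kind not in ("use", "inheritance"):
            raise ValueError(kind)
        self.relationships.add((kind, source, target))

    def hide(self, name):
        # Hide affects display only; the object stays in the database.
        if name not in self.classes:
            raise KeyError(name)
        self.hidden.add(name)

    def delete(self, name):
        # Delete removes the object and its relationships from the database.
        self.classes.remove(name)
        self.hidden.discard(name)
        self.relationships = {
            r for r in self.relationships if name not in r[1:]
        }

    def displayed(self):
        """Visible classes and the relationships whose ends are both visible."""
        shown = self.classes - self.hidden
        edges = {r for r in self.relationships
                 if r[1] in shown and r[2] in shown}
        return sorted(shown), sorted(edges)
```

Hiding a class drops it and its relationships from `displayed()` while `relationships` keeps them, so a later "show" could restore the picture without reloading anything.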

The  View  Manager 

"View"  operations  are  used  to  control  and 
display  the  information  in  text  and  canvas  windows. 
However,  "view"  operations  do  not  affect  data  itself. 
The  view  manager  implements  commands  which  control 
the  presentation  of  the  information  within  a  canvas 
window  and/or  a  text  window. 

New: Creates a new instance for the view manager.

Show File: Displays a named file in a text window. This file
cannot be modified using the view manager.

Show Graph: Displays a named graph on a canvas. This graph
cannot be modified using the view manager.

Text Editor: Starts a text window that can be used for
documentation.

Print File: Sends a text file to the printer.

Snapshot: Retrieves information of some portion of a
displayed graph, saves it in a named file, and displays that
portion enlarged.


The  Code  Analyzer 

This thread contains commands used in the
extraction of architectural and procedural information of
C++ programs. The analyzer uses a parser to scan C++
code files and identify C++ code components. In
addition, the analyzer searches for relationships between
C++ code components. Examples of these code
components include classes and functions, while
examples of relationships between code components
include "use" and "inheritance."

The menu interface to this thread includes the
following operations:

New: Creates a new instance of the analyzer and
initializes the analyzer database.

Analyze: Starts the interface template for the
user to type the filename(s), activates
the parser, and populates the analyzer
database.

Display: Invokes Sugiyama's algorithm to
compute object positions and displays
the graph.
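
Layered graph drawing in the style of Sugiyama starts by placing each node on a level before crossings are reduced and coordinates are assigned. The text does not say how EDG performs this step, so the Python below is only a sketch of one common choice, longest-path layering, applied to hypothetical inheritance edges; crossing reduction and coordinate assignment are not shown.

```python
def assign_levels(edges):
    """Longest-path layering: a node sits one level below its deepest parent.

    `edges` are (parent, child) pairs and must form an acyclic graph.
    """
    nodes = {n for edge in edges for n in edge}
    parents = {n: [] for n in nodes}
    for parent, child in edges:
        parents[child].append(parent)

    levels = {}

    def level(node, path=()):
        if node in path:
            raise ValueError("cycle through " + node)
        if node not in levels:
            # Roots have no parents, so max() falls back to -1 and they get 0.
            levels[node] = 1 + max(
                (level(p, path + (node,)) for p in parents[node]),
                default=-1,
            )
        return levels[node]

    for node in nodes:
        level(node)
    return levels


# Hypothetical class hierarchy used only to exercise the function.
levels = assign_levels([
    ("Shape", "Circle"), ("Shape", "Polygon"),
    ("Polygon", "Square"), ("Shape", "Square"),
])
```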

The  Browser 

The browser helps programmers to understand
object-oriented software systems written in C++. To
achieve this goal, this thread provides information about
the set of classes and files comprising the system and
relationships among them. With the set of operations
described below, users of this thread may "browse"
through the system based on relationships among
classes, files, and identifiers. Useful cross-reference
information about class methods and variables may also
be found.

New: Creates a new instance of the browser and initializes
the browser database.

Show Class: Reads all classes from the database and shows
all classes in a panel. When any one of the project classes
is selected, its declaration file is opened and displayed in
a text window. The selected class becomes the "active class."

Select Class: Changes the current "active class." The
selected class text is validated against class names stored
in the database. If the class name is valid, then the active
class is changed. Otherwise, an error message box is shown
to the user.

Class Info: Returns information about the "active class."
This information is shown on the information panel and
includes:

- class name
- "uses" relationship
- "used-by" relationship
- "is_a_subclass" relationship
- "is_a_superclass" relationship
- inheritance path
- header file
- number of "use" relationships
- number of "used-by" relationships

Declarations: A text window is shown, which opens the
declaration file for the active class. To select the active
class one may use the mouse and highlight a class identifier
or may highlight a function identifier. Either way, the
declaration file for that class or function is shown in a
text window.

Definitions: Similar to "Declarations." Instead of
displaying the header file, it displays the definition file
for the selected identifier.

Show Info: This operation is used to give a summary of
information about an identifier. Identifiers may be classes
or functions. The following information related to
identifiers is preserved in the database and displayed when
this operation is selected:

- identifier_name
- is_overloaded
- variable_or_function
- type
- declaration_file
- definition_file
- parameter_types
- declaration_file_index
- definition_file_index
- comments
- call_relationships
- called_from_relationships
- total_number_of_calls
- number_of_calls_from
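
Several of the "Class Info" fields listed for the browser can be derived from just two stored relations. The Python sketch below shows one way this might be done; the function `class_info`, the dictionary shapes, and the sample class names are assumptions made for illustration, and single inheritance is assumed to keep the inheritance path a simple chain.

```python
def class_info(name, uses, superclass):
    """Derive a few 'Class Info' fields from two stored relations.

    `uses` maps a class to the classes it uses; `superclass` maps a class
    to its direct superclass (single inheritance assumed).
    """
    # "used-by" is the inverse of the stored "uses" relation.
    used_by = sorted(c for c, targets in uses.items() if name in targets)
    # Follow superclass links upward, stopping if a link would repeat.
    path, current = [name], name
    while current in superclass and superclass[current] not in path:
        current = superclass[current]
        path.append(current)
    return {
        "class_name": name,
        "uses": sorted(uses.get(name, ())),
        "used_by": used_by,
        "inheritance_path": path,
        "number_of_uses": len(uses.get(name, ())),
        "number_of_used_by": len(used_by),
    }


# Hypothetical project data.
uses = {"Editor": {"Graph", "Database"}, "Browser": {"Database"}}
superclass = {"Graph": "Drawable", "Drawable": "Object"}
info = class_info("Database", uses, superclass)
```

Storing only the forward relation and computing the inverse on demand keeps the two from drifting apart when the database is updated.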

The  Run  Project 

The run project contains the set of commands
used to interface EDG with other tools (e.g., compilers,
debuggers, and code generators).

Generate: Generates C++ code skeletons for classes.

Make: Generates a Makefile for an entire project based on
project descriptions.

Compile: Executes the make file.

Execute: Executes the object code for the project.

Debug: Invokes a symbolic debugger, for example dbxtool.


Appendix  B 

EDFD  Requirements 

EDFD supports:

1.  Identifying input and output values.

2.  Using data flow diagrams as needed to show
functional dependencies.

3.  Describing what each function does.

4.  Identifying constraints.

5.  Specifying optimization criteria.

To provide support for 1 and 2, we provide a
data flow diagram editor. To provide support for 1 and 4,
we provide a data dictionary editor. To support 3 and 5,
we provide a describing functions editor. We continue
this discussion with a description of the operations
required to support each one of the editors.
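
The assignment of the five requirements to the three editors can be written down and checked for coverage. This short Python sketch records only the assignment stated above; the dictionary name `EDITOR_SUPPORTS` is chosen for this example.

```python
# Editor -> numbered EDFD requirements it supports (as stated in the text).
EDITOR_SUPPORTS = {
    "data flow diagram editor": {1, 2},
    "data dictionary editor": {1, 4},
    "describing functions editor": {3, 5},
}

# Requirements covered by at least one editor, and any left uncovered.
covered = set().union(*EDITOR_SUPPORTS.values())
missing = set(range(1, 6)) - covered
```

Requirement 1 is the only one served by two editors; all five are covered.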

The  Data  Flow  Diagram  Editor 

Data elements flow from one process node to
another process node where they are processed. Data flow
diagrams consist of four graphical components:
processes, data flows, data stores, and actors. Processes
denote functions of the system. Data flows represent
data elements flowing between process nodes. Data
stores represent places where data elements are stored.
Actors are independent objects that produce and
consume values (sources or sinks of data).
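
The four components suggest a simple data model: three kinds of node plus the flows that connect them. The Python sketch below is one possible representation, given here only as an illustration; the names `Node`, `DataFlow`, and `check_diagram`, and the sample diagram, are assumptions and not part of EDFD.

```python
from dataclasses import dataclass

NODE_KINDS = {"process", "data_store", "actor"}


@dataclass(frozen=True)
class Node:
    name: str
    kind: str       # one of NODE_KINDS


@dataclass(frozen=True)
class DataFlow:
    source: str     # name of the node the data leaves
    target: str     # name of the node the data enters
    data: str       # label of the data element


def check_diagram(nodes, flows):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    names = {n.name for n in nodes}
    for n in nodes:
        if n.kind not in NODE_KINDS:
            problems.append(f"{n.name}: unknown kind {n.kind}")
    for f in flows:
        for end in (f.source, f.target):
            if end not in names:
                problems.append(f"{f.data}: unknown node {end}")
    return problems


# Hypothetical diagram: an actor feeds a process that writes to a store.
nodes = [Node("Clerk", "actor"), Node("Validate", "process"),
         Node("Orders", "data_store")]
flows = [DataFlow("Clerk", "Validate", "order form"),
         DataFlow("Validate", "Orders", "valid order")]
problems = check_diagram(nodes, flows)
```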

Data  flow  diagrams  depict  a  system  from  the 
data's  point  of  view.  Using  data  flow  diagrams,  the 
analyst  is  able  to  show  how  data  flows  in  a  system,  how 
data  is  transformed  by  the  system,  and  where  to  store  data 
in  the  system.  To  help  us  with  this  data  flow 
description,  we  provide  the  following  data  flow  diagram 
editor. 

New: Starts a new graph editor window and initializes the
data flow diagram database.

Add: Adds objects to the graph database. In EDFD, objects
can be of four kinds: processes, data stores or file
objects, and actor objects. In addition, there are
association objects: data flow between processes, data flow
that results in a data store, and control flow. We should
also provide the possibility of having composition of data
values, decomposition of data values, duplication of data
values, and access and update of data values.

Delete: Removes objects from the database.

Rename: Changes the name of an object.

Select: Changes the active object.

Color: Changes colors of objects in a specific data flow
diagram.

Hide: Removes an object and its associations from a data
flow diagram. These objects are not displayed, but are kept
in the database.

Move: Changes object positions in a displayed graph.

Load Graph: Reads the information from a file which contains
the description of a data flow diagram, and displays it.

Save Graph: Writes the data flow diagram description in a
named file for future retrieval.

The  Data  Dictionary  Editor 

Data flow diagrams are documented by a data
dictionary. The data dictionary defines the meaning of
each data flow and data element. To support this
activity we provide the following data dictionary editor.

New: Starts a new data dictionary.

Add Entry: Inserts a new object in the data dictionary
database.

Delete Entry: Removes an existing object from the data
dictionary database.

Show DD: Displays the data dictionary in a text window.

Load DD: Copies a specified data dictionary into main
storage.

Save DD: Stores an existing data dictionary in a file for
future retrieval.

Print DD: Generates a file with a copy of an existing data
dictionary and sends that file to the printer.

The  Describing  Functions  Editor 

Ultimately, leaf processes in the data flow
diagrams must be specified directly as operations. The
use of structured English, pseudocode, and/or any other
form of textual documentation is recommended. To
support this activity, we provide the following describing
functions editor.

New: Starts a new describing functions instance.

Add Function: Inserts a new function description in the
database.

Delete Function: Removes an existing function from the
database.

Show Functions: Displays a specific function in a read-only
text window.

Load Functions: Copies a specified set of function
descriptions into main storage.

Save Function: Stores an existing set of function
descriptions in a file(s) for future retrieval.

Print Function: Generates a file with a copy of an existing
function description and sends that file to the printer.

Print All: Generates a file with a copy of all function
descriptions existing in the database, sorts them, and
sends that file to the default system printer.
As existing systems are changed to keep up with
changing needs, their structure becomes less coherent and
cohesive, making it difficult and increasingly expensive to
make further changes. In addition, as the legacy of
complex automated systems grows while the available
resources for upgrading or replacing them shrink, there is
increasing concern for finding ways to leverage these
systems as a base for new or improved existing systems.
Reengineering is the concept of creating an improved
system by judiciously reorganizing, revising, and
extending an existing system. Reuse is a related concept in which
a set of existing similar systems provide the basis for a
product line of new systems. The Consortium's Synthesis
methodology integrates reengineering within the
framework of a systematic reuse-driven process that promises
more cost-effective development and maintenance of
software and systems in the future.

Motivations  for  reengineering 

Reengineering  is  commonly  viewed  as  a  variant  of 
system  maintenance.  Maintenance  differs  from  other 
phases  of  system  engineering  in  that  its  focus  is  an 
operational  system  that  requires  modifications  to  correct 
errors,  to  support  customer  needs  more  effectively,  or  to 
satisfy  changed  needs.  Over  its  useful  lifetime,  a  system 
must  be  repeatedly  modified  to  stay  responsive  to  the 
needs  of  the  customers  it  is  intended  to  serve.  However, 
modifying  a  system  in  response  to  changing  needs 
inevitably  undermines  the  conceptual  and  structural 
integrity  and  subsequent  modifiability  of  the  system  as 
needs  continue  to  change.  Reengineering  is  distinguished 
from  other  forms  of  maintenance  because  it  presumes  the 
need  for  a  redesign  of  the  existing  system  to  make  current 
and  future  changes  more  cost-effective  and  less  error- 
prone.  It  is  a  type  of  maintenance  because  the  system  is 
not  rebuilt  from  scratch  but  is  derived  in  large  part  from 
the  artifacts  of  the  existing  system. 

This  material  is  based  in  part  upon  work  funded  by  the  Virginia 
Center  of  Excellence  for  Software  Reuse  and  Technology  Transfer, 
sponsored  by  the  Advanced  Research  Projects  Agency  under  Grant 
#  MDA972-92-J-1018.  The  content  does  not  necessarily  reflect  the 
position  or  the  policy  of  the  U.S.  Government,  and  no  official 
endorsement should be inferred.


Reengineering  of  a  system  involves  first  the  analysis 
of  the  existing  system,  referred  to  as  reverse  engineering, 
and  then  reformulation,  restructuring,  and  modification  of 
the  system  so  that  required  changes  are  easier  to  make 
reliably.  Reengineering  may  be  needed  for  several 
reasons: 

•  Documentation of the system's requirements, design,
and implementation has either been lost or become
unreliable because of subsequent changes to the
implementation. Original needs may not be well
understood. This makes it difficult and risky to change the
system because of uncertainties in how parts of the
system interact or why certain functions behave as
they do.

•  The  needs  served  by  the  system  have  changed 
sufficiently  that  the  original  design  is  no  longer  a 
good  solution.  A  redesign  is  required  for  the  system  to 
continue  to  be  acceptable  to  its  users. 

•  The  technology  upon  which  the  system  is  based  has 
become  obsolete.  To  accommodate  improved 
technology  and  better  serve  user  needs,  the  system 
must  be  substantially  redesigned. 

In  the  worst  case,  a  system  may  have  all  of  these 
problems.  Although  reengineering  can  accomplish  its 
intended  purpose  of  creating  a  better  structured,  more 
maintainable  system  in  the  short  run,  it  may  do  nothing  to 
avoid  recurrence  of  the  problems  that  led  to  the  need  for 
reengineering  in  the  first  place.  If  recurrence  of  these 
problems  is  not  somehow  prevented  by  the  reengineering 
or  avoided  in  subsequent  maintenance  of  the  system,  then 
after  some  time  reengineering  will  again  be  necessary. 
Taking  a  different  view  of  reengineering  can  reduce 
recurrence  of  these  problems. 

A  framework  for  reengineering 

The  driving  concern  for  reengineering  is  the  ability  to 
create  a  system  that  can  be  easily  changed  as  customer 
needs  change.  The  driving  concern  for  reuse  is  the  need  to 
field  multiple  systems  or  system  versions  that  satisfy 
similar  yet  differing  needs,  of  one  or  several  customers, 
without  having  to  repeatedly  develop  similar  software 
from scratch. In reality, these two concerns are the same:
the ability to produce similar systems, whether serially or
concurrently,  to  satisfy  similar  needs. 

Looking  more  closely  at  the  possible  motivations  for 
reengineering  a  system  reveals  several  alternative 
objectives: 

•  To  make  changes  in  the  functioning  or  operational 
properties  (e.g.,  reliability,  performance)  of  an 
existing  system 

•  To  make  it  easier  or  safer  to  make  current  and  future 
changes  in  an  existing  system 

• To use existing systems as a foundation for similar
future  systems 

When  reengineering  is  motivated  only  by  the  first  of 
these  objectives,  the  situation  is  not  particularly  different 
from  that  of  conventional  development  and  maintenance. 
In  this  case,  the  objective  is  most  likely  addressed 
adequately  by  traditional  approaches  to  maintenance  in 
which  the  architecture  of  a  system  is  upgraded  or 
particular  data  structures  or  algorithms  are  replaced  by 
improved  alternatives.  The  other  two  objectives  warrant  a 
different  approach  based,  the  Consortium  believes,  on  the 
concept  of  program  families  [1,  2]. 

When  the  objective  of  reengineering  is  either  to  make 
a  system  easier  to  change  or  to  use  legacy  systems  as  a 
foundation for future systems, Dijkstra’s and Parnas’
concept  of  orienting  development  to  a  family  of  systems 
provides  significant  opportunities  for  leverage  in  compar¬ 
ison  to  a  traditional,  single-system  orientation.  Even  a 
single  system  inevitably  evolves  through  multiple  ver¬ 
sions  because  of  poorly  understood  requirements  or  to 
accommodate  changing  requirements  or  technology.  The 
Consortium’s  approach  to  reengineering  is  based  on 
families  of  systems,  within  the  framework  of  the 
Synthesis  methodology  for  reuse-driven  software  pro¬ 
cesses  [3].  Reengineering  within  a  Synthesis  process 
comprises  conventional  reverse  engineering  capabilities 
for  the  analysis  of  existing  artifacts  combined  with  an 
innovative  reuse-driven  approach  to  creating  and  using 
families  of  systems  as  a  basis  for  both  the  development 
and  maintenance  of  systems. 

A  reuse-driven  software  process 

An  organization’s  primary  motivation  for  instituting  a 
Synthesis  process  is  that  the  organization  perceives  itself 
as  having  expertise  in  and  serving  the  market  for  a 
cohesive business area. The market has the need for either
a single evolving system or several similar systems, which
in either case offers a basis for conceiving a family of
systems. 


The  essence  of  our  approach  is  that  development 
should  result  in  a  family  of  similar  systems  from  which  it 
is  possible  to  mechanically  derive  alternative  members  of 
the  family  for  rapid  delivery  to  customers.  A  family  is  not 
just  an  abstract  conception  but  a  concrete  formulation.  It 
is  designed  and  constructed  as  the  means  for  systematic 
production  and  modification  of  systems  to  satisfy  diverse 
or  changing  needs. 

A  Synthesis  process,  as  depicted  in  Figure  1,  consists 
of  two  major  activities:  domain  engineering  and  applica¬ 
tion  engineering.  These  activities,  described  briefly  here, 
are  defined  fully  and  in  detail  in  [4],  along  with  extensive 
practical  guidance. 

Application  engineering  is  concerned  entirely  with 
the  needs  of  a  particular  customer  and  with  producing  a 
system  that  effectively  addresses  those  needs.  Applica¬ 
tion  engineering  prototypically  consists  of  four 
subactivities: 

•  Project  management.  Planning,  monitoring,  and 
controlling  an  application  engineering  project  to 
respond  to  customer  needs. 

•  Application  modeling.  Formalizing  customer  needs 
and  analyzing  alternative  solutions  in  terms  of  a  set  of 
decisions  that  are  sufficient  to  distinguish  a  particular 
instance  of  a  supported  family  of  systems. 

•  Application  production.  Producing  a  system  by 
means  of  a  prescribed  mechanical  selection,  adapta¬ 
tion,  and  composition  of  reusable  components,  di¬ 
rected  by  the  decisions  made  in  application 
modeling. 

•  Delivery  and  operation  support.  Installing  a  system  in 
its  operational  environment,  training  users,  assisting 
them  in  effective  system  operation  and  identifying 
changes  that  will  make  the  system  a  better  fit  to 
customer  needs. 

Domain  engineering  focuses  on  how  to  make 
application  engineering  most  effective  in  meeting  both 
the  objectives  of  the  business  and  the  needs  of  the  targeted 
market.  To  achieve  this,  domain  engineering  formalizes  a 
family  of  systems  as  a  domain  by  identifying  the  common 
and  varying  features  of  the  type  of  systems  that  the  market 
requires.  Typically,  domain  engineering  supports  multi¬ 
ple  application  engineering  projects.  Domain  engineering 
consists  of  five  subactivities: 

• Domain management. The planning, monitoring, and
control of the domain engineering effort. This encompasses
coordination with application engineering
project management and concern for all facets of
process management, including configuration
management and quality assurance disciplines.

• Domain definition. Establishing the scope and extent
of  the  domain  and  formalizing  the  variabilities  that 
differentiate  instances  of  the  targeted  family  of 
systems. 

•  Product  family  engineering.  Formalizing  standard¬ 
ized  (adaptable)  requirements,  design,  and  imple¬ 
mentation  for  the  family  of  systems  and  all  associated 
deliverable  and  supporting  work  products. 

•  Process  engineering.  Formalizing  a  definition  of  a 
standardized  application  engineering  process  and 
creating  automated  support  for  its  performance.  The 
prototypical  description  of  application  engineering 
described  above  is  tailored  to  suit  the  specific  needs 
of  the  domain  and  associated  projects. 

•  Project  support.  Assisting  application  engineering 
projects  to  make  effective  use  of  the  domain.  This  in¬ 
cludes  validating  whether  the  domain  is  responsive  to 
project  needs  and  identifying  needed  improvements 
and  changes. 

A  domain  is  a  formalization  of  a  family  of  systems  and 
an  associated  process  for  producing  members  of  the 
family.  A  system  is  represented  by  a  set  of  artifacts  (i.e., 
work  products).  A  system  is  not  just  a  collection  of 
implemented  (i.e.,  code)  components  but  includes 
associated  requirements/design/user  documentation,  test 
materials,  management  plans,  and  any  other  artifacts  that 
result  from  development  or  support  the  use  or 
maintenance of the system. Synthesis is concerned with
producing all of these when a system is needed. Creating a
family  of  systems  means  creating  a  representation  of  each 
type  of  relevant  artifact  as  a  family  in  its  own  right. 
Furthermore,  artifacts  may  be  made  up  of  components 
which  are  in  turn  instances  of  component  families.  An 
essential  objective  of  a  Synthesis  process  is  to  create  a 
material  representation  of  all  necessary  system,  artifact, 
and  component  families. 

The essence of a family in this sense is that it represents
a  set  of  ‘similar’  individuals,  by  which  we  mean 
individual  things  that  are  identical  relative  to  a  specified 
set  of  traits.  A  family  is  formulated  as  an  abstraction  that 
denotes  a  set  of  similar  individuals  and  identifies  the 
particular  traits  that  determine  membership  in  the  family. 
Materially  representing  a  family  requires  expressing  not 
only  the  substance  of  similarity  but,  equally  important, 
the  details  of  variation  (out  of  which  come  distinct 
individuals).  Variations  are  additional  traits  that  together 
make  each  of  the  individual  members  of  a  family  unique 
and  correspond  to  decisions  that  are  necessarily  deferred 
until  a  particular  family  member  (i.e.,  individual  system, 
artifact,  or  component)  is  needed.  Production  of  an 
individual  then  reduces  to  resolving  these  deferred 
decisions  as  needed  to  designate  and  mechanically  derive 
the  corresponding  member  of  the  family. 
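
As an illustration of this idea, the following minimal Python sketch represents a family as a set of common traits plus a set of deferred decisions, and derives a member by resolving those decisions. The `Family` class, its `derive` method, and the logger example are hypothetical names introduced here for illustration only; they are not taken from the Synthesis methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    """A family: traits shared by every member plus the decisions left open."""

    common: dict         # traits fixed for all members
    variabilities: dict  # decision name -> tuple of allowed values

    def derive(self, decisions: dict) -> dict:
        """Resolve every deferred decision and return one concrete member."""
        missing = set(self.variabilities) - set(decisions)
        extra = set(decisions) - set(self.variabilities)
        if missing or extra:
            raise ValueError(f"unresolved={sorted(missing)} unknown={sorted(extra)}")
        for name, value in decisions.items():
            if value not in self.variabilities[name]:
                raise ValueError(f"{value!r} is not allowed for {name!r}")
        return {**self.common, **decisions}


# Hypothetical family of message loggers (illustrative only).
logger_family = Family(
    common={"kind": "logger", "timestamped": True},
    variabilities={"sink": ("file", "console"), "level": ("debug", "info", "error")},
)

member = logger_family.derive({"sink": "file", "level": "info"})
```

Rejecting unresolved or unknown decisions is one possible design choice: it keeps derivation mechanical, because a member is then fully determined by the decisions supplied.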

Methods  for  creating  and  using  component  families 
are  referred  to  as  metaprogramming  [5].  A  metaprogram¬ 
ming  technique  specifies  how  to  create  a  component 
family and subsequently transform it into concrete
instances. Mechanisms such as C preprocessor constructs,
Ada  generics,  and  form-letter  capabilities  of  word 
processing  software  have  proven  sufficient  for  a  viable 
Synthesis  process.  Other,  special-purpose  mechanisms, 
which  are  more  complete  but  experimental,  may  provide 
additional  leverage. 
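
A form-letter style of metaprogramming can be sketched with ordinary text substitution. The example below uses Python's standard `string.Template` as a stand-in for the mechanisms named above; the stack component, its parameters, and the Ada-like text it generates are assumptions made for illustration and are not drawn from any Synthesis tool.

```python
from string import Template

# Hypothetical component family: a bounded stack whose name, element type,
# and capacity are deferred decisions. The generated text is Ada-like and
# purely illustrative.
STACK_FAMILY = Template(
    "package ${name}_Stack is\n"
    "   Capacity : constant := ${capacity};\n"
    "   procedure Push (Item : in ${element});\n"
    "   procedure Pop (Item : out ${element});\n"
    "end ${name}_Stack;\n"
)


def instantiate(name: str, element: str, capacity: int) -> str:
    """Derive one concrete component by substituting the resolved decisions."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return STACK_FAMILY.substitute(name=name, element=element, capacity=capacity)


track_stack = instantiate("Track", "Track_Record", 64)
```

Each call to `instantiate` yields a distinct family member from the same template, which is the sense in which a component family is transformed into concrete instances.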

Experience  with  Synthesis 

The  Synthesis  approach,  until  now  emphasizing  reuse 
with  only  limited  concern  for  reengineering,  has  been 
used  extensively  by  several  industrial  organizations.  Two 
organizations,  in  particular,  have  contributed  significant¬ 
ly  to  understanding  Synthesis  and  how  to  achieve 
effective  reuse: 

•  Rockwell  Command  and  Control  Systems  Division. 
Rockwell  began  using  Synthesis  experimentally  in 
1990.  They  have  now  progressed  to  the  point  that 
they  are  evaluating  its  use  in  support  of  a  substantial 
business  area.  Their  experience  is  described  in  [6]. 

•  Boeing/NAVAIR  STARS*  demonstration  project. 
Boeing  evaluated  and  selected  the  Synthesis  method¬ 
ology  as  the  basis  for  its  demonstration  of  megapro¬ 
gramming  and  reuse  [7].  This  experience  is  now 
being  transferred  into  trial  use  of  Synthesis  by  the 
Naval  Training  Systems  Command  of  NAVAIR. 

In addition to these two examples, Synthesis is in
experimental  use  on  projects  in  Martin-Marietta, 
Lockheed,  and  other  organizations.  Based  on  this  experi¬ 
ence,  Synthesis  is  proving  to  be  a  viable  and  sound 
approach  for  systematic  reuse-driven  software  engineer¬ 
ing.  The  experience  so  far  in  all  of  these  is  that  a  Synthesis 
process  provides  an  effective  capability  for  rapidly 
building multiple systems or system versions and subsequently
modifying them as customer needs change. We
believe that Synthesis also provides an effective framework
for systematic reengineering as an aid to leveraging
existing  systems  in  producing  new  or  improved  software 
and  systems. 

Reengineering  within  a  Synthesis  process 

An  organization  institutes  a  Synthesis  process  because 
it  has  expertise  in  a  targeted  business  area  and  intends  to 
serve  the  associated  market.  As  a  rule,  requisite  evidence 
of sufficient expertise to justify such a business commitment
is that the organization has produced systems for this
market  in  the  past.  Such  legacy  systems  are  a  good  initial 
source  from  which  to  create  a  domain  as  the  formalization 
of  a  family.  For  effective  use  of  legacy  systems, 
reengineering  is  an  integral  element  of  the  Synthesis 
process. 

* STARS is the Software Technology for Adaptable Reliable Systems
program  of  the  Advanced  Research  Projects  Agency. 


Just  as  the  emphasis  in  Synthesis  is  on  creating  a 
family  of  similar  systems,  the  result  of  reengineering 
within  Synthesis  should  be  not  just  an  ‘improved’  variant 
of  the  legacy  system(s)  but  a  family  of  similar  systems 
from  which  alternative  instances  of  the  family  can  be 
derived.  Derivable  instances  include  alternative  systems 
that  are  equivalent  to  an  initial  legacy  system  but 
improved  in  some  way  as  well  as  systems  that  are  useful 
hybrids  or  modifications  of  initial  legacy  systems. 

Within  Synthesis,  reengineering  is  not  viewed  as  a 
separate  activity.  Since  a  Synthesis  process  is  meant  to  be 
a  comprehensive  engineering  process,  many  of  the 
necessary  aspects  of  reengineering  are  already  a  part  of 
the  process.  Currently,  whenever  a  Synthesis  activity 
involves  the  creation  of  a  work  product,  it  accommodates 
the  analysis  of  existing  systems  as  one  source  of  the 
information  in  that  work  product.  For  example,  require¬ 
ments  specifications  of  legacy  systems  can  be  a  source  for 
determining  how  best  to  express  the  requirements  for  the 
family  as  a  whole.  Similarly,  test  cases  used  in  regression 
testing  of  those  systems  can  be  a  source  for  creating  test 
cases  to  be  used  in  testing  future  systems.  Only  the  use  of 
reverse engineering capabilities needs further elaboration
as  integral  elements  of  Synthesis  activities.  In  large  part, 
this  means  the  enhancement  of  the  product  family 
engineering  activity  of  domain  engineering  to  describe 
and  explain  the  use  of  such  capabilities. 

A  family  of  systems  can  be  derived  initially  from 
either  a  single  or  several  similar  legacy  systems.  Reengi¬ 
neering  may  be  concerned  with  any  and  all  of  the  work 
products  associated  with  a  system.  When  a  system  is 
modified,  changes  are  rarely  limited  to  code  components; 
reengineering  should  facilitate  coordinated  change  across 
the  entire  set  of  artifacts  associated  with  a  system, 
including  requirements,  design,  code,  documentation, 
and  test  support. 

One  aspect  of  a  Synthesis  process  is  the  design  and 
implementation  of  component  families.  This  takes  the 
form  of  reengineering  when  components  are  available 
from legacy systems. Reengineering of legacy components
to create a family can start with a bottom-up analysis
of  similarities  among  existing  components.  However,  in  a 
Synthesis  process,  analysis  is  guided  by  a  top-down 
specification  of  component  families  based  on  an  orga¬ 
nization’s  business  objectives.  The  challenges  in  creating 
a  viable  family  by  reengineering  are  to  identify  compo¬ 
nents  that  fit  sufficiently  within  the  scope  of  the 
envisioned  family  and  to  distinguish  essential  variations 
among  identified  components  (i.e.,  driven  by  sound 
customer  requirements  or  engineering  concerns)  from 
incidental  (and  therefore  unneeded)  variations.  The 
leverage from this approach to reengineering comes from
recognizing that distinguishable instances can be derived
from  the  unified  abstraction  of  a  family. 
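
The bottom-up part of this analysis can be approximated mechanically. The sketch below, which assumes that two legacy components are available as lists of source lines, uses Python's standard `difflib.SequenceMatcher` to separate what the components share from where they differ. The function name and the sample components are hypothetical. Deciding whether a reported variation is essential or incidental still requires engineering judgment that the sketch does not attempt.

```python
from difflib import SequenceMatcher


def commonality_and_variation(a: list[str], b: list[str]):
    """Split two components into shared lines and paired differing regions."""
    common, variations = [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            common.extend(a[i1:i2])          # candidate commonality
        else:
            variations.append((a[i1:i2], b[j1:j2]))  # candidate variability
    return common, variations


# Hypothetical legacy components, one statement per line.
legacy_a = ["read sensor", "scale by 0.5", "clamp to range", "write output"]
legacy_b = ["read sensor", "scale by 0.25", "clamp to range", "write output"]

common, variations = commonality_and_variation(legacy_a, legacy_b)
```

Here the differing scale factor is reported as the only variation, which suggests a single deferred decision for the family.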

System  reengineering  as  a  generalization  of 
software  reengineering 

System  engineering  is  concerned  with  hardware, 
software,  and  manual  procedures  and  the  interactions 
among  them.  Much  of  the  interest  in  reengineering  has 
focused  on  software  because  of  the  increasing  cost  of 
maintenance  associated  with  software  changes.  However, 
as  defense  spending  shrinks,  the  useful  life  of  individual 
systems  grows  longer.  To  respond  to  new  and  changing 
needs,  there  is  a  corresponding  need  and  benefit  in 
reengineering  complete  systems  comprising  hardware, 
software,  and  manual  procedures.  Reengineering  a  sys¬ 
tem  can  involve  coordinated  changes  to  any  of: 

•  The  system  architecture,  including  physical  and 
informational connections and the number, identity,
and  capabilities  of  the  system’s  hardware  and 
software  components 

•  The  design  and  implementations  of  individual 
hardware  and  software  components 

• The business/organizational and user processes
within  which  the  hardware/software  system  operates 

Another  dimension  of  reengineering  at  the  system 
level,  in  contrast  with  the  software  level,  is  that  the 
tradeoff  between  hardware,  software,  and  manual  proce¬ 
dures  can  be  reconsidered.  As  technology  advances,  it 
becomes  easier  to  move  software  functions  into  custom 
hardware.  Alternatively,  moving  a  hardware  function  into 
software  can  increase  flexibility  for  modifying  it  in 
response  to  changing  needs.  Similarly,  as  system  usage 
matures  and  manual  procedures  become  more  standard¬ 
ized,  it  becomes  feasible  to  implement  more  of  them  in 
software. 

As  with  software-oriented  reengineering,  the  goal  of 
system  reengineering  should  not  be  narrowly  to  recon¬ 
struct  a  system  to  meet  current  needs  but  also  to  facilitate 
and  reduce  the  costs  of  future  changes  as  well.  From  this 
perspective,  significant  leverage  arises  from  considering 
overall  system  concerns,  as  well  as  those  related  to  each 
hardware,  software,  and  procedural  component  of  a 
system,  within  a  reengineering  approach.  Furthermore, 
the  similarity  of  motivations  for  reuse  and  reengineering 
justifies  a  unified  approach  for  systems  as  well  as  for 
software. 

Issues  in  reengineering 

Most  of  the  same  issues  that  make  system  engineering 
a  complex  task,  such  as: 


•  Understanding  the  real  requirements  for  a  system  so 
that  effort  is  not  wasted  solving  the  wrong  problem 

• Evaluating alternative solutions and making
engineering tradeoffs to attain a proper balance
among system properties such as performance,
reliability,  development  costs,  and  operating  costs 

•  Verifying  process  performance  and  intermediate 
work  products  and  validating  the  final  product  to  en¬ 
sure  that  the  problem  has  been  solved  properly  and 
correctly 

are  similarly  a  concern  for  reengineering.  In  addition  to 
these  common  concerns,  reengineering  raises  additional 
issues,  specifically  an  extended  verification  problem  and 
a  deoptimization  problem  in  reverse  engineering.  These 
problems  are  inherent  to  reengineering,  regardless  of 
approach. 

Whenever  a  system  is  constructed,  it  must  be 
validated  to  determine  whether  it  satisfies  the  customer’s 
actual  needs.  Because  in  the  context  of  reengineering  an 
operational  system  already  exists,  it  is  reasonable  to 
expect  that  validation  reduces  to  a  problem  of  verifying 
the  replacement  system  as  the  equivalent  of  the  existing 
system.  When  the  replacement  system  is  supposed  to  have 
identical  functionality  to  the  current  system,  this  equates 
to  a  total  regression  test  of  the  replacement  system. 
However,  creating  a  replacement  system  with  identical 
functionality  is  seldom  feasible  or  necessary;  creating 
only  near-identical  functionality  is  usually  sufficient  and 
less  costly.  Unfortunately,  a  divergence  from  identical 
functionality  makes  regression  testing  much  more  diffi¬ 
cult.  When,  as  is  often  the  case  because  of  changed  needs, 
the  replacement  system  also  must  differ  in  certain 
functionality  from  that  of  the  current  system,  the  problem 
takes  on  the  characteristics  to  a  greater  or  lesser  degree  of 
a  complete  revalidation.  For  reengineering  to  be  practical, 
the  effort  of  not  only  development  but  of  verification  and 
validation  as  well  must  be  significantly  reduced  as 
compared  to  that  of  completing  the  system  from  scratch. 
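
The extended verification problem can be made concrete with a small differential test harness. The sketch below is illustrative only: it assumes both systems can be exercised as functions over numeric inputs, and the names `differential_regression`, `legacy_scale`, and `new_scale` are hypothetical. Near-identical behavior is modeled by a tolerance, and intended functional differences are modeled by an explicit table of new expected results.

```python
def differential_regression(legacy, replacement, inputs, intended_changes=None, tol=0.0):
    """Return the inputs on which the replacement diverges unexpectedly.

    intended_changes maps an input to the new expected output for cases
    where the replacement is supposed to differ from the legacy system.
    """
    intended_changes = intended_changes or {}
    failures = []
    for x in inputs:
        expected = intended_changes.get(x, legacy(x))
        actual = replacement(x)
        if abs(actual - expected) > tol:
            failures.append((x, expected, actual))
    return failures


# Hypothetical legacy and replacement computations.
def legacy_scale(x):
    return x * 0.1 if x >= 0 else 0.0   # legacy clamps negatives to zero


def new_scale(x):
    return x / 10.0                     # replacement lets negatives through


inputs = [0, 5, 20, -5]
unexpected = differential_regression(legacy_scale, new_scale, inputs, tol=1e-9)
accepted = differential_regression(
    legacy_scale, new_scale, inputs, intended_changes={-5: -0.5}, tol=1e-9
)
```

Every entry in `intended_changes` is a case the legacy system can no longer vouch for, so it must be validated independently. The larger that table grows, the closer the effort comes to a complete revalidation.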

Reengineering generally requires the reverse
engineering  of  a  system  for  recovery  of  missing  or 
obsolete  design  information  or  to  establish  precise, 
as-implemented  requirements.  Unfortunately,  particular¬ 
ly  in  the  case  of  software,  the  as-implemented  structure  is 
often  an  optimized  equivalent  of  the  intended  design.  For 
example,  real-time  embedded  software  usually  entails 
responding  to  asynchronous  events  in  the  environment; 
logically,  this  corresponds  to  an  architecture  consisting  of 
multiple  concurrent  processes.  However,  to  satisfy  strin¬ 
gent  performance  constraints,  such  an  architecture  has 
traditionally  been  implemented  in  a  cyclic  executive  in 
which the logic of the processes is interleaved in a
nonobvious fashion. Trivial reverse engineering would
not reveal the true logical structure of the software but
instead  would  describe  a  much  more  complex  linear 
structure.  Reverse  engineering  techniques  must  be  devel¬ 
oped  that  help  discover  the  original  requirements  and 
design  while  recognizing  that  the  implementation  is 
actually  an  optimization. 
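
The deoptimization problem can be illustrated with a deliberately small sketch. It assumes two logical activities with periods of 10 ms and 20 ms; the task names and the schedule are hypothetical. The two functions stand for the intended design, and `cyclic_executive` stands for the as-implemented form, in which a fixed schedule table replaces concurrent processes.

```python
# Logical design: two independent periodic activities.
def sample_sensor(log):      # logically runs every 10 ms
    log.append("sample")


def update_display(log):     # logically runs every 20 ms
    log.append("display")


def cyclic_executive(major_cycles, log):
    """As-implemented form: one loop over a fixed schedule of minor frames.

    Each major cycle covers 20 ms, split into two 10 ms minor frames. The
    schedule table is what a reverse engineer must recognize as an
    optimized encoding of the two logical processes above.
    """
    schedule = [
        (sample_sensor, update_display),  # minor frame 0
        (sample_sensor,),                 # minor frame 1
    ]
    for _ in range(major_cycles):
        for frame in schedule:
            for task in frame:
                task(log)


trace = []
cyclic_executive(2, trace)
```

In real legacy code the task bodies are often inlined and split across frames rather than kept as separate functions, which makes the logical processes far harder to recover than this sketch suggests.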

Conclusions 

The  Synthesis  methodology  for  domain-specific 
software  development  offers  a  comprehensive  framework 
for  a  reuse-  and  reengineering-based  approach  to  revital¬ 
izing  existing  operational  systems  and  producing  new, 
more  maintainable  systems.  Issues  remain  in  the  specific 
methods and technologies of reengineering, reuse-driven
product  lines,  and  process  automation.  However,  based  on 
extensive  trial  use  by  industry  and  government,  the 
essential  process  framework  is  sound.  Further  work  will 
demonstrate  the  benefits  of  taking  such  a  product  line 
perspective  whether  the  motivation  is  to  reduce  the  costs 
of  new  development  or  the  costs  of  maintenance  and 
whether the focus is on software or on systems.

Abstract

The  United  States  Air  Force’s  Wright  Laboratory 
and  TASC  are  developing  an  environment  for  the 
reengineering  of  software  from  one  language  to  an¬ 
other.  Our  approach  engineers  a  program  in  the  new 
language  by  reusing  portions  of  the  original  implemen¬ 
tation  and  design.  We  use  reverse  engineering  to  fa¬ 
cilitate  understanding,  design  recovery,  viewing,  and 
navigating  of  the  subject  system.  We  use  computer- 
assisted  restructuring  to  aid  the  engineer  in  developing 
a  program  using  design  and  implementation  informa¬ 
tion  recovered  from  the  subject  system.  We  use  au¬ 
tomatic  translation  of  low-level  program  statements  to 
free  the  engineer  from  the  tedium  associated  with  syn¬ 
tactic  differences  between  languages.  This  paper  de¬ 
scribes  our  reengineering  process  model,  the  design  of 
our  reengineering  environment,  and  the  current  state 
of  the  implementation. 


1  Introduction 

The  reengineering  of  software  from  one  language 
to  another  is  becoming  a  necessity  as  Department  of 
Defense  organizations  strive  to  modernize  and  improve 
the  maintainability  of  their  systems  while  avoiding  the 
excessive  costs  of  new  development.  Systems  that  have 
been  in  use  for  years  often  incur  large  maintenance 
costs  [6]  for  a  number  of  reasons,  including: 

•  Continual  maintenance  has  made  the  current  im¬ 
plementation  and  original  design  inconsistent, 
made  the  code  harder  to  understand  and  error- 
prone,  and  made  the  documentation  out-of-date. 

•  They  are  written  in  languages  that,  while  once 
popular,  have  fallen  out  of  favor.  The  limited  se¬ 
lection  of  support  tools  for  these  languages,  the 
corresponding  expense  of  these  tools,  and  the 
shrinking pool of qualified programmers to maintain
the software add to the expense of maintenance.

•  They  were  developed  without  modern  software 
engineering practices, resulting in code that, by
today’s standards, lacks structure and is difficult
to  understand. 

•  Employee  turnover  has  reduced  the  amount  of  un¬ 
derstanding  and  “intimate”  knowledge  of  the  sys¬ 
tem. 

TASC,  under  the  auspices  of  Wright  Laboratory,  is 
currently  developing  an  environment  for  reengineer¬ 
ing  software  from  one  language  to  another  as  part  of 
the Avionics Software Reengineering Technology (ASRET)
project. We are initially concentrating on the
reengineering  of  avionics  simulation  software  written 
in  FORTRAN  to  Ada,  but  the  environment  is  designed 
so  that  additional  languages  can  be  supported  in  the 
future. 

Under the ASRET project, we have performed an
extensive investigation of existing reengineering and
reverse  engineering  processes,  techniques,  and  tools 
[23].  Based  upon  this  study,  we  have  developed  a  pro¬ 
cess  model  that  defines  reengineering  in  terms  of  (non- 
destructively)  engineering  a  new  program  by  reusing 
the  design  and  implementation  of  the  original  pro¬ 
gram.  The  process  model  is  consistent  with  well- 
accepted  reengineering  models  [4,  5],  and  improves 
on  them  by  dividing  automated  restructuring  from  re¬ 
structuring  that  requires  human  insight  and  by  defin¬ 
ing  restructuring  tasks  in  terms  of  modern  software 
engineering  practices. 

We  have  designed  and  are  currently  implementing 
a  reengineering  tool  (RET)  that  automates  portions 
of the process model and incorporates selected techniques
from the study. With our system, the engineer
non-destructively develops a new Ada program
by  reusing  parts  of  the  original  FORTRAN  design 
and  implementation,  as  opposed  to  changing  the  orig¬ 
inal  FORTRAN  into  Ada.  For  example,  an  engineer 
can  run  an  automatic  packaging  routine  that  extracts 
FORTRAN  subprograms,  translates  their  declarations 
into  Ada,  and  arranges  them  into  packages  based  upon 
their  data  usage.  The  engineer  can  then  rearrange  the 
resulting  Ada  subprograms  interactively.  When  satis¬ 
fied  with  the  package  structure,  the  engineer  can  di¬ 
rect  the  system  to  automatically  translate  statements 
in  the  bodies  of  the  subprograms.  We  are  currently 
implementing  the  RET. 
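
One plausible way to group subprograms by data usage is sketched below. It is not the RET's packaging algorithm, which is not detailed here. The sketch assumes that data usage is known as the set of COMMON blocks each subprogram references, and it places subprograms in the same package when they share a block, directly or transitively. All subprogram and block names are hypothetical.

```python
def package_by_data_usage(usage: dict[str, set[str]]) -> list[list[str]]:
    """Group subprograms that transitively share at least one data block."""
    parent = {name: name for name in usage}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    owner = {}  # data block -> first subprogram seen using it
    for name, blocks in usage.items():
        for block in blocks:
            if block in owner:
                parent[find(name)] = find(owner[block])
            else:
                owner[block] = name

    groups = {}
    for name in usage:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical subprograms and the COMMON blocks each one references.
usage = {
    "INITNAV": {"NAVDAT"},
    "UPDNAV": {"NAVDAT", "CLOCK"},
    "TICK": {"CLOCK"},
    "DRAWHUD": {"HUDBUF"},
}
packages = package_by_data_usage(usage)
```

Transitive grouping can produce overly large packages when one block is used almost everywhere, which is one reason the engineer's ability to rearrange the result interactively matters.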

Most  existing  reengineering  tools  fall  into  one  of 
two categories:

• Reverse engineering and redocumentation tools
[8,  23,  20]  that  present  different  views  of  the  struc¬ 
ture  of  a  program,  such  as  control  flow  and  data 
flow  graphs,  to  aid  in  program  understanding  and 
manual  reengineering. 

•  Other  tools  [18,  24]  support  automatic  transla¬ 
tion  from  one  language  to  another  or  forms  of 
automated  restructuring,  such  as  the  removal  of 
GOTOs.  These  tools  require  little  human  inter¬ 
action  but,  because  of  this,  provide  little  support 
for  design  recovery  or  improvement. 

Our  approach  can  be  described  as  computer-assisted 
reengineering.  It  provides  automated  reverse  engineer¬ 
ing,  redocumentation,  and  translation  of  low-level  pro¬ 
gram  entities,  but  also  provides  a  combination  of  user 
interaction  and  automated  analysis  to  reorganize  pro¬ 
gram  statements  and  data  into  new  modules.  The 
RET  will  relieve  the  tedium  associated  with  syntactic 
minutiae, i.e., differences between the source and target
programming  language  syntax,  and  allow  the  engineer 
to  concentrate  on  the  more  important  design  and  im¬ 
plementation  decisions  that  will  make  the  reengineered 
system  more  maintainable. 

The  rest  of  this  paper  is  structured  as  follows:  Sec¬ 
tion  2  introduces  our  reengineering  process  model  and 
compares  it  with  existing  reengineering  process  mod¬ 
els.  Section  3  describes  the  RET  design.  Section  4 
describes  the  current  state  of  the  RET  implementa¬ 
tion.  Section  5  presents  some  concluding  remarks. 

2  The  Reuse-Based  Reengineering 
Process 

Our  reengineering  process  model  as  applied  to  the 
RET  domain  is  illustrated  in  Figure  1.  Steps  in  the 
process  label  the  boxes  in  the  figure  and  inputs  and 
outputs  for  each  step  label  the  icons  between  boxes. 
The  process  model  specifies  a  set  of  tasks  (the  steps 
of  the  process)  that  should  be  performed  and  the  se¬ 
quence  in  which  they  should  be  performed  to  reengi¬ 
neer  a  program  in  one  language,  e.g.,  FORTRAN,  to 
another  language,  e.g.,  Ada.  The  source  and  target 
languages  can  be  the  same  if  the  goal  of  reengineering 
is  simply  to  improve  the  structure  of  the  program  with¬ 
out  moving  to  a  different  language.  The  process  model 
also  specifies  the  information  necessary  and  desirable 
to  support  these  tasks.  The  process  model  does  not 
specify  how  the  tasks  are  to  be  performed,  i.e.,  they 
might  be  automated  (as  many  are  in  the  RET)  or  they 
might  be  performed  manually. 

The  first  step  in  the  process  model  is  to  perform 
some  preliminary  restructuring  of  the  source  code  of 
the  original  implementation.  Preliminary  restructur¬ 
ing  improves  the  layout  of  the  source  code  by  remov¬ 
ing  unstructured  program  constructs,  such  as  GOTO 
statements,  dead  code,  and  implicit  types.  Prelimi¬ 
nary  restructuring  is  separated  from  the  later  restruc¬ 
turing  step  because  it  can  be  completely  automated  by 
commercial  tools,  and  placed  first  in  the  process  model 
because  the  structured  version  of  the  source  program 
is  usually  easier  to  analyze,  understand,  and  restruc¬ 
ture. 


After  preliminary  restructuring  is  complete,  the  im¬ 
proved  source  code  is  analyzed  and  representations  of 
the  program  are  constructed.  Some  of  the  representa¬ 
tions,  e.g.,  abstract  syntax  graphs  (ASGs)  and  sym¬ 
bol  tables,  are  machine-readable  representations  used 
only  by  automated  restructuring  and  redesign  tasks. 
Others,  e.g.,  flow  graphs  and  structure  charts,  aid  in 
program  understanding,  redocumentation,  and  man¬ 
ual  restructuring.  For  manual  restructuring,  the  set 
of  representations  will  certainly  contain  a  source  code 
listing. 

The  restructuring,  redesign,  and  redocumentation 
steps  of  the  process  model  are  performed  multiple 
times,  each  time  building  upon  the  results  of  the  previ¬ 
ous  pass.  A  multi-pass  approach  is  necessary  because, 
in  programs  of  reasonable  size,  it  is  easier  and  less 
error-prone  to  reengineer  a  program  in  stages,  verify¬ 
ing  the  program  after  each  pass.  Restructuring,  i.e., 
changing  the  structure  of  the  program  without  chang¬ 
ing  its  functionality,  should  be  performed  first,  possi¬ 
bly  in  several  passes.  These  passes  should  perform  the 
following  steps: 

•  Macro  control  restructuring  -  Grouping  state¬ 
ments  and  control  structures  of  the  program 
into  modules,  such  as  procedures,  functions,  and 
packages.  This  includes  recovering  modules  of 
the  original  program,  generating  new  modules, 
and  specifying  a  declaration  nesting  structure  for 
modules. 

•  Macro  data  restructuring  -  Grouping  of  data 
items,  such  as  types,  variables,  and  constants,  and 
associating  them  with  modules  created  during 
macro  control  restructuring.  This  includes  recov¬ 
ering  data  groupings  of  the  original  program,  cre¬ 
ating  new  groupings,  and  creating  abstract  data 
types  and  records. 

• Micro control restructuring - Manipulation of
individual control structures. This includes
the translation of individual statements and
functionality-maintaining alterations, such as
code lifting [1].

•  Micro  data  restructuring  -  Manipulation  of  indi¬ 
vidual  data  items.  This  includes  actions  such  as 
translation,  changing  names,  changing  types,  cre¬ 
ating  symbolic  constants,  and  changing  the  scope 
of  variables. 

Macro  control  and  data  restructuring  should  be  per¬ 
formed  first  to  develop  a  modular  structure  for  the 
target  system,  followed  by  micro  control  and  data  re¬ 
structuring  to  restructure  individual  components  of 
the  program. 

Figure 1: The reengineering process model.

After restructuring is complete, code in the target
language should be generated and the program should
be tested to ensure that the restructuring did not
introduce any errors or undesired functional changes.
The test data of the original program can be used and
the results compared with the results of testing the
original program. In many cases, the test data will
need to be reengineered to work with the reengineered
program. Any differences in the results of testing will
indicate the introduction of an unexpected functional
change during restructuring. Coverage analysis should
be performed during the testing of the target code because
restructuring could have introduced or altered
control and data characteristics of the program. When
an error in the target program is indicated, the program
can be corrected by amending the target code
directly or by restructuring the representations and
regenerating the target code.
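
The comparison of test results between the original and the reengineered program can be sketched as follows. This is a minimal illustration and not part of the RET: the two callables stand in for the original and target programs, and the names `compare_runs`, `Mismatch`, `original_program`, and `restructured_program` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class Mismatch:
    """One test input for which the two programs disagree."""
    test_input: Any
    original_output: Any
    target_output: Any


def compare_runs(original: Callable[[Any], Any],
                 target: Callable[[Any], Any],
                 test_inputs: Iterable[Any]) -> List[Mismatch]:
    """Run both programs on each input and collect any differing results."""
    mismatches = []
    for test_input in test_inputs:
        expected = original(test_input)
        actual = target(test_input)
        if expected != actual:
            mismatches.append(Mismatch(test_input, expected, actual))
    return mismatches


# Hypothetical stand-ins for the original and the restructured program.
def original_program(x: int) -> int:
    return x * x + 1


def restructured_program(x: int) -> int:
    # A deliberately introduced functional change for negative inputs.
    return x * x + 1 if x >= 0 else x * x
```

An empty result suggests functional equivalence only for the inputs exercised, which is one reason the text recommends coverage analysis on the target code as well.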

Once  the  program  has  been  restructured  and  a  func¬ 
tionally  equivalent  program  in  the  target  language 
has  been  created,  the  engineer  can  perform  additional 
restructuring  and  redesign  actions  on  the  program. 
These steps use the same set of actions, i.e., macro
control, macro data, micro control, and micro data,
but  have  different  goals.  Further  restructuring  is  per¬ 
formed  to  further  improve  the  structure  of  the  pro¬ 
gram.  Redesign  has  the  goal  of  changing  the  func¬ 
tionality  of  the  program,  e.g.,  to  correct  design  flaws 
or  improve  the  design.  If  the  target  program  code 
was  edited  to  correct  errors  indicated  during  testing, 
the  code  is  analyzed  to  generate  representations  before 
subsequent  restructuring  and  redesign  is  performed. 

Redocumentation  is  performed  simultaneously  with 
the  restructuring  and  redesign  steps,  e.g., 

•  The  generated  representations  can  be  saved  and 
serve  as  documentation  of  the  program  structure 
and design.

• The engineer can add comments and annotations
during  restructuring  and  redesign  as  he  gains  in¬ 
sights  about  the  code  or  design. 

The  RET  reengineering  process  model  includes 
modern  software  development  processes,  such  as  con¬ 
tinuous  testing,  iterative  restructuring  and  redesign, 
and configuration management. The process model is
a  specialization  of  the  Chikofsky-Cross  process  model 
[5,  8].  The  entire  Chikofsky-Cross  model  is  repre¬ 
sented,  although  there  are  the  following  changes: 

•  Inclusion  of  program  management  extensions  to 
the  process  model  [4],  such  as  configuration  man¬ 
agement  and  testing. 

•  Separation  of  easily  automated  steps,  such  as  pre¬ 
liminary  restructuring,  so  they  can  be  addressed 
by  commercial  tools. 

•  Decomposition  of  Chikofsky-Cross  steps,  such  as 
restructuring  being  decomposed  into  macro  con¬ 
trol,  macro  data,  micro  control,  and  micro  data 
restructuring. 

•  Explicit  introduction  of  the  iteration  steps  that 
are  implicit  in  the  Chikofsky-Cross  process. 

3  RET  Design  Overview 

The  Reengineering  Tool  (RET)  assists  the  software 
engineer  in  understanding  and  improving  the  structure 
of  an  avionics  software  support  system  while  translat¬ 
ing  its  source  code  from  the  source  programming  lan¬ 
guage  (FORTRAN)  to  the  target  programming  lan¬ 
guage  (Ada). 

The  RET  comprises  two  distinct  logical  parts  called 
the Left-Hand Side (LHS) and the Right-Hand Side
(RHS).  The  LHS  provides  views  of  the  original  FOR¬ 
TRAN  program,  or  subject  system.  The  RHS  provides 
views  of  the  Ada  program  being  developed,  i.e.,  the 
target  system.  The  LHS  allows  the  engineer  to  navi¬ 
gate  and  view  aspects  of  the  subject  system,  but  does 
not  support  changing  the  subject  system.  The  RHS 
supports  constructing,  refining,  viewing,  and  navigat¬ 
ing  the  target  system.  The  engineer  constructs  a  ba¬ 
sic  structure  for  the  RHS  (macro  restructuring)  using 
information  extracted  from  the  LHS.  Once  the  basic 
structure  of  the  RHS  is  established,  the  engineer  re¬ 
fines  the  target  system  (micro  restructuring)  on  the 
RHS. 

Semi-automated RET components support construction
activities; they suggest large-scale reorganizations
of the subject system and populate the RHS
with  the  basic  structure  of  the  target  system.  The 
components  that  support  refinement  allow  the  engi¬ 
neer  to  apply  knowledge,  which  is  beyond  the  RET, 
and  human  insight,  which  is  lacking  in  the  semi- 
automated  support  provided  by  the  RET,  to  modify 
and  improve  the  RHS  representations. 

3.1  Representations 

The RET provides two primary internal representations
(PIRs): the Abstract Syntax Graph (ASG) and
the Symbol Table (ST). Secondary internal representations
(SIRs) are derived from the PIRs. The SIRs
are the underlying data structures for the views presented
to the engineer. The RET provides six views
for each side:

•  Source  Code  Listing  (SCL)  -  displays  the  FOR¬ 
TRAN  or  Ada  source  code 

•  Declaration  Diagram  (DED)  -  displays  the  pro¬ 
gram  declaration  nesting  structure 

•  Call  Diagram  (CD)  -  displays  the  subprogram 
calling  structure 

•  Data  Flow  Diagram  (DFD)  -  displays  how  data 
flows  through  the  program 

•  Hypertext  Annotations  (HA)  -  displays  trees  and 
networks  of  textual  commentary  provided  by  the 
engineer 

•  Data  Dictionary  (DD)  -  displays  various  cross  ref¬ 
erences 

3.2  The  RET  Architecture 

Figure  2  shows  the  RET  architecture.  It  depicts 
the  organization  of  major  RET  components,  and  in¬ 
dicates  the  data  flow  relationships  among  them.  The 
Preliminary  Restructurer  (PR)  performs  control  flow 
restructuring.  The  Source  Code  Processor  (SCP)  gen¬ 
erates  the  LHS  PIRs. 

The engineer constructs the RHS PIRs using the
Packager (PACK) and Transformer (TRAN). The
Representation Generator (RG) creates the SIRs on
both sides from the PIRs. The User Interface and
Display (UID) creates the corresponding views, and
provides the means by which the engineer interacts
with the views on both sides and alters the views on
the RHS. The Transformer implements the changes
by transforming the RHS PIRs, and the Representation
Generator propagates the changes to the RHS
SIRs. The User Interface and Display refreshes the
RHS views in response to the changes.

The  File  System  Interface  (FSI)  manages  external 
persistent  data  and  the  Object  Base  (OB)  manages 
internal  data. 

The views, PIRs, and SIRs are thus interdependent,
but the engineer need not be aware that the PIRs and
SIRs exist. Any changes that the engineer makes to
the target system through the views provided by the
User Interface and Display may appear to affect the
views exclusively. The components are described in
more detail below.
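
The dependency between views, SIRs, and PIRs can be illustrated with a small sketch. It is only an assumed model, not the RET implementation: the class `Representations`, its methods, and the dictionary used as a stand-in PIR are hypothetical. The sketch also simplifies the refresh policy by discarding stale SIRs as soon as the PIR is transformed.

```python
from typing import Callable, Dict, List

Pir = Dict[str, List[str]]  # hypothetical PIR: subprogram name -> names it calls


class Representations:
    """Holds a primary representation and lazily derived secondary ones."""

    def __init__(self, pir: Pir):
        self.pir = pir
        self._generators: Dict[str, Callable[[Pir], object]] = {}
        self._sirs: Dict[str, object] = {}

    def register_view(self, name: str, generator: Callable[[Pir], object]) -> None:
        """Associate a view name with the function that derives its SIR."""
        self._generators[name] = generator

    def view(self, name: str) -> object:
        """Generate the SIR when the view is first requested, then reuse it."""
        if name not in self._sirs:
            self._sirs[name] = self._generators[name](self.pir)
        return self._sirs[name]

    def transform(self, change: Callable[[Pir], None]) -> None:
        """Apply a change to the PIR only, then drop SIRs that are now stale."""
        change(self.pir)
        self._sirs.clear()


reps = Representations({"MAIN": ["INIT", "STEP"], "INIT": [], "STEP": []})
reps.register_view("call_count", lambda pir: {k: len(v) for k, v in pir.items()})
before = reps.view("call_count")["MAIN"]
reps.transform(lambda pir: pir["MAIN"].append("REPORT"))
after = reps.view("call_count")["MAIN"]
```

Because every change goes through `transform`, a caller who only ever touches `view` sees the views change without needing to know that the underlying representations exist.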

Preliminary Restructurer. The Preliminary Restructurer
(PR) restructures the control flow of
the original FORTRAN source code by eliminating
branches into or out of loops and decisions. It eliminates
all GOTO statements, leaving only the pure
structured programming constructs: sequence, selection,
and iteration. We refer to this specialized form
of restructuring as control flow restructuring to distinguish
it from the more general concept of restructuring
described in Section 2.

Figure 2: The RET architecture.

The  Preliminary  Restructurer  is  applied  as  a  pre¬ 
processing  step  and  the  RET  assumes,  but  does  not 
require,  that  the  FORTRAN  source  code  has  already 
undergone  control  flow  restructuring.  There  are  two 
reasons  for  this  design.  The  first  is  that  the  subject 
system  is  not  always  poorly  structured  with  respect 
to  control  flow,  so  the  step  should  be  optional.  The 
second  reason  is  that  the  design  allows  any  control  flow 
restructuring  product  to  be  used  without  integrating 
it  into  the  RET. 

Source  Code  Processor.  The  Source  Code  Pro¬ 
cessor  (SCP)  reads  the  FORTRAN  source  code,  per¬ 
forms  analyses,  and  generates  the  PIRs.  The  PIRs 
represent  the  structure  of  the  program,  semantic  in¬ 
formation  about  program  components,  and  data  flow 
information. 

The  Source  Code  Processor  will  also  detect  ap¬ 
parent  undefined-reference  (U-R)  data  flow  anomalies 
(DFA) on the LHS by static analysis of the LHS. A
U-R/DFA is an occurrence of a variable binding which is
referenced  before  it  is  assigned  a  value.  An  instance  of 
this  involving  some  local  VAX/FORTRAN  variable  in 
a  subprogram  is  called  an  apparent  U-R/DFA  because 
it  is  not  necessarily  a  data  flow  anomaly,  even  though 
it  may  be  detected  as  such  by  the  static  analysis.  The 
(VAX/FORTRAN)  variable  enjoys  static  extent  [21], 
and  may  have  been  assigned  a  value  on  a  prior  call  of 
the  subprogram.  Ada  variables  have  automatic  extent 
[21],  so  it  is  important  for  the  RET  to  identify  these 
situations  and  promote  the  variables  to  an  appropriate 
enclosing  scope. 
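
A much simplified version of this check can be sketched as follows. The sketch is an assumption-laden illustration, not the Source Code Processor's analysis: it handles only straight-line code (no branches or loops), models each statement as a pair of variable sets, and uses the hypothetical name `apparent_ur_anomalies`.

```python
from typing import Iterable, List, Set, Tuple

# Each statement is modeled as (variables assigned, variables referenced).
Statement = Tuple[Set[str], Set[str]]


def apparent_ur_anomalies(statements: Iterable[Statement],
                          parameters: Set[str]) -> List[str]:
    """Return local variables referenced before any assignment, in order found."""
    defined = set(parameters)  # parameters arrive with a value
    flagged: List[str] = []
    for assigned, referenced in statements:
        # References are evaluated before the statement's own assignment,
        # so COUNT = COUNT + 1 references COUNT first.
        for name in sorted(referenced):
            if name not in defined and name not in flagged:
                flagged.append(name)
        defined |= assigned
    return flagged


# Body of a hypothetical subprogram with parameter X:
#     COUNT = COUNT + 1
#     TOTAL = X * COUNT
body = [({"COUNT"}, {"COUNT"}), ({"TOTAL"}, {"X", "COUNT"})]
candidates = apparent_ur_anomalies(body, parameters={"X"})
```

Here `COUNT` is flagged. Under static extent it may simply be a call counter that keeps its value between calls, which is why such a variable is a candidate for promotion to an enclosing scope in the Ada program rather than a definite error.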

Representation Generator. The Representation
Generator (RG) creates the LHS and RHS SIRs. The
Representation Generator generates each SIR when its
corresponding view is requested by the engineer. Once
the RHS SIRs are created, they may become inconsistent
with the PIRs as the latter are restructured. The
engineer can request a refresh, which regenerates the
SIRs from the current PIRs, or an environment variable
can be set so that SIRs are refreshed periodically.

The  Representation  Generator  creates  SIRs  for  the 
DFD,  CD,  and  DED  views.  The  Representation  Gen¬ 
erator  produces  the  DFD  according  to  a  method  for 
creating  Hierarchical  Data  Flow  Diagrams  given  in  [2]. 
It  generates  the  CD  in  a  straightforward  manner  from 
information  in  the  ST.  The  DED  is  a  canonical  orga¬ 
nization  of  information  in  the  ST. 
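
Deriving a call diagram from symbol table information might look like the following sketch. The symbol table layout (a dictionary whose entries carry a `kind` and an optional `calls` list) and the function names are assumptions made for illustration; the actual ST structure is not described in this paper.

```python
from typing import Dict, List


def call_diagram(symbol_table: Dict[str, dict]) -> Dict[str, List[str]]:
    """Return caller -> sorted callees for every subprogram entry."""
    edges: Dict[str, List[str]] = {}
    for name, entry in symbol_table.items():
        if entry.get("kind") != "subprogram":
            continue  # variables, constants, and types have no call edges
        callees = {
            callee for callee in entry.get("calls", [])
            if symbol_table.get(callee, {}).get("kind") == "subprogram"
        }
        edges[name] = sorted(callees)
    return edges


def uncalled(edges: Dict[str, List[str]]) -> List[str]:
    """Subprograms that no other subprogram calls (candidate entry points)."""
    called = {callee for callees in edges.values() for callee in callees}
    return sorted(name for name in edges if name not in called)


table = {
    "MAIN": {"kind": "subprogram", "calls": ["INIT", "STEP", "STEP"]},
    "INIT": {"kind": "subprogram", "calls": []},
    "STEP": {"kind": "subprogram", "calls": ["SIN"]},  # SIN: intrinsic, not in table
    "GAIN": {"kind": "variable"},
}
edges = call_diagram(table)
```

Calls to names that have no subprogram entry, such as intrinsics, are dropped in this sketch; a real tool would more likely show them as external nodes.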

The Representation Generator does not create SIRs
for the SCL or HA views. The SCL view is produced
by the DIALECT [12] printer. Hypertext annotations
are  provided  by  the  engineer  during  use  of  the  RET. 
The  RET  allows  the  engineer  to  enter  textual  com¬ 
ments,  i.e.,  annotations,  describing  insights,  recovered 
design  information,  or  any  other  information.  The  en¬ 
gineer  can  associate  each  annotation  with  any  part 
of  the  views,  PIRs,  or  SIRs  in  a  hypertext  network. 
The  HA  is  maintained  by  the  Representation  Gener¬ 
ator  and  can  be  incorporated  as  in-line  comments  or 
notes  during  code  generation  or  redocumentation  by 
the  File  System  Interface. 

Restructurer. The Restructurer (RES) component
comprises the Packager (PACK) and the Transformer
(TRAN).  The  Restructurer  helps  the  engineer  develop 
an  Ada  program  on  the  RHS  by  transforming  and 
reusing  components  of  the  original  FORTRAN  pro¬ 
gram  from  the  LHS.  The  Packager  assists  the  engineer 
with  macro  restructuring.  The  Transformer  supports 
both  macro  and  micro  restructuring. 

All restructuring activities performed by the engineer
using the Restructurer manipulate the PIRs exclusively,
by initially creating RHS symbol tables and
abstract syntax graphs and then by populating and
transforming  them.  The  information  is  entered  by  the 
engineer  through  the  views  and,  once  the  PIRs  have 
been  transformed,  the  SIRs  are  regenerated  and  the 
views  are  refreshed.  Thus,  the  underlying  PIRs  and 
SIRs  are  hidden  from  the  engineer  and  it  appears  as 
if  the  views  are  being  transformed  directly. 

Packager.  The  Packager  constructs  or  initializes  the 
RHS  ASG.  It  recognizes  subprogram,  object,  and  type 
entities  from  the  LHS  PIR  and  requests  the  Trans¬ 
former  to  transform  them  from  the  LHS  domain  model 
to  the  RHS  domain  model,  and  to  insert  them  into  the 
RHS ASG. (A domain model [3] is a kind of object-based
database schema.) The Packager builds an Ada
ASG  on  the  RHS  and  calls  upon  REFINE/Ada  [15, 16] 
for  semantic  analysis. 

The  Packager  groups  the  entities  into  Ada  pack¬ 
ages  on  the  RHS  by  applying  an  interactive  clustering 
technique  based  upon  [7,  9,  10,  19].  The  clustering 
technique  provides  only  a  first  approximation  to  a  rea¬ 
sonable  Ada  package  structure.  We  expect  that  the 
engineer  will  need  to  interactively  refine  the  generated 
grouping. 
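
To give a feel for what a first approximation to a package structure might look like, the sketch below groups subprograms that share data items, directly or transitively. This is a deliberately simple stand-in and not the clustering technique of [7, 9, 10, 19]; the function name, the input format, and the subprogram names are all hypothetical.

```python
from typing import Dict, List, Set


def cluster_by_shared_data(uses: Dict[str, Set[str]]) -> List[List[str]]:
    """Group subprograms linked, directly or transitively, by shared data items."""
    parent = {name: name for name in uses}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    first_user: Dict[str, str] = {}
    for name, items in uses.items():
        for item in items:
            if item in first_user:
                parent[find(name)] = find(first_user[item])
            else:
                first_user[item] = name

    groups: Dict[str, List[str]] = {}
    for name in uses:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


candidate_packages = cluster_by_shared_data({
    "READ_NAV": {"NAVBUF"},
    "FILTER_NAV": {"NAVBUF", "GAINS"},
    "LOAD_GAINS": {"GAINS"},
    "PRINT_LOG": {"LOGUNIT"},
})
```

Transitive grouping of this kind tends to produce overly large clusters on real programs, which is consistent with the expectation that the engineer will refine the generated grouping interactively.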

Transformer. The Transformer assists the Packager
with both macro and micro restructuring. For
macro restructuring, the Transformer automatically
transforms low-level entities from the FORTRAN domain
model to the Ada domain model by applying
rules that insert subgraphs and other information into
the Ada PIRs corresponding to subgraphs and information
in the FORTRAN PIRs. For micro restructuring,
the Transformer implements changes to low-level
Ada entities by allowing the engineer to select a
portion of the Ada program under development and
change it by applying a rule or by editing, deleting, or
inserting text.

The  engineer  may  indirectly  provide  input  to  the 
Transformer  by  selecting  portions  of  a  view  and  then 
interacting  with  the  view  to  change  its  SIR.  The  Trans¬ 
former  implements  the  change  by  transforming  corre¬ 
sponding  portions  of  the  PIRs  while  the  Representa¬ 
tion  Generator  propagates  the  results  of  the  transfor¬ 
mation  to  the  other  SIRs.  The  User  Interface  and 
Display  updates  the  views  so  that  it  may  appear  to 
the  engineer  that  the  change  was  made  directly  to  the 
selected  view. 

User  Interface  and  Display,  File  System  Inter¬ 
face,  Object  Base.  The  RET  provides  two  external 
interfaces.  The  engineer  communicates  with  the  RET 
through  the  User  Interface  and  Display  (UID).  The 
UID  shows  the  views,  prompts  the  engineer  for  input, 
receives  commands  and  selections  from  the  engineer, 
and  delivers  them  to  the  other  components. 

The File System Interface (FSI) is responsible for
the storage and retrieval of persistent data. The FSI
inputs the (FORTRAN) source code of the subject system,
reads and writes intermediate data, and outputs
the (Ada) source code for the target system and other
subject and target system views. The intermediate
data is stored in the Object Base (OB).

4  Implementation 

4.1  RET  Development  Environment 

We  are  developing  the  RET  on  a  SPARCstation 
10/40 under SunOS 4.1.3. The RET utilizes several
commercial  tools,  including  the  following  tools  by  Rea¬ 
soning  Systems: 

•  The  REFINE  language  [14]  -  a  high-level  lan¬ 
guage  that  includes  object-oriented,  rule-based, 
and  iterative  characteristics.  It  includes  an  ob¬ 
ject  database  and  an  environment  that  facilitates 
interactive  development  and  testing. 

•  DIALECT  [12]  -  A  tool  for  building  parsers  and 
code  generators. 

•  INTERVISTA  [13]  -  A  tool  for  building  user  in¬ 
terfaces,  including  mouse-sensitive  windows  and 
menus. 

• REFINE/FORTRAN [17] - A tool that parses and
analyzes FORTRAN code. It includes a printer
for FORTRAN that generates source code from a
symbol table and abstract syntax graph.

•  REFINE/Ada  [16]  -  A  tool  that  parses  and  ana¬ 
lyzes  Ada  code.  It  includes  a  printer  for  Ada  that 
generates  source  code  from  a  symbol  table  and 
abstract  syntax  graph. 

•  A  number  of  unsupported  systems  that  can  be 
used  in  the  RET  development,  such  as  hypertext 
and  fast  dump/load  facilities. 

4.2  Implementation  of  RET  Components 

We  are  implementing  the  components  of  the  RET 
as  follows: 

•  Preliminary  Restructurer  -  The  SPAG  com¬ 
ponent  of  the  plusFORT  product  [11]  will  provide 
the  entire  Preliminary  Restructurer. 

•  Source  Code  Processor  -  REFINE/FOR¬ 
TRAN  provides  most  of  the  functionality  for 
the  Source  Code  Processor  component  of  the 
RET.  We  have  made  slight  extensions  to  RE¬ 
FINE/FORTRAN  to  gather  and  summarize  infor¬ 
mation  needed  by  other  RET  components.  RE¬ 
FINE/FORTRAN  also  provides  most  of  the  LHS 
PIRs. 

•  Representation  Generator  -  We  are  imple¬ 
menting  the  RG  component  in  the  REFINE  lan¬ 
guage,  using  the  REFINE  Object  Base  for  the 
SIR  structures.  REFINE/FORTRAN  and  RE¬ 
FINE/Ada  provide  the  Source  Code  Listing  SIRs 
for the LHS and RHS, respectively.

• Restructurer - We are developing the RES component
using the REFINE language. The Restructurer
will extract information from the LHS
PIRs either interactively or automatically, and
apply REFINE transformations to generate the
RHS PIRs.

•  User  Interface  and  Display  -  We  are  develop¬ 
ing  the  UID  using  INTERVISTA.  INTERVISTA 
provides  support  for  developing  mouse-sensitive 
windows,  menus,  and  diagrams.  It  is  based  upon 
Allegro  Common  Windows. 

•  File  System  Interface  -  REFINE  provides  fa¬ 
cilities  for  saving  the  state  of  a  session  and  for 
storing  some  of  the  representations  to  disk.  We 
are  developing  additional  code  to  save  the  SIRs 
to  disk. 

•  Object  Base  -  The  object  base  is  provided  by 
the  REFINE  Object  Base. 

4.3  Status  of  the  RET  Implementation 

We  have  developed  and  tested  most  of  the  Source 
Code  Processor,  Representation  Generator,  and  File 
System  Interface  components.  We  are  developing  the 
associated  User  Interface  and  Display  facilities  along 
with  each  component.  We  have  just  begun  to  im¬ 
plement  the  Restructurer  component.  We  expect  to 
complete the first prototype/demonstration system by
mid-Spring  1994,  after  which  development  will  con¬ 
tinue. 

5  Conclusions 

We  discovered  during  a  literature  and  tool  survey 
that  most  existing  reengineering  tools  are  limited  to 
reverse  engineering  and  redocumentation.  Language 
translation  tools  have  been  around  for  some  time,  but 
they are not adequate for reengineering [22]. Few tools
provide  automated  help  with  restructuring  and  for¬ 
ward  engineering,  although  techniques  for  these  exist. 

We  have  developed  a  reengineering  process  model 
that: 

•  Is  consistent  with  previous  reengineering  process 
models 

•  Considers  software  life  cycle  issues  such  as  config¬ 
uration  management  and  testing 

• Separates completely automated restructuring
(i.e., control flow restructuring) that can be performed
as preprocessing from restructuring and
redesign that requires human intervention and input

•  Promotes  structured  programming  techniques, 
i.e.,  macro  restructuring  preceding  micro  restruc¬ 
turing 

•  Promotes  development  and  testing  of  a  function¬ 
ally  equivalent  program  before  undertaking  de¬ 
sign  changes  that  might  introduce  errors,  i.e.,  re¬ 
structuring  preceding  redesign. 


We  are  developing  a  reengineering  tool  (RET)  that 
implements  parts  of  the  process  model.  The  RET 
will  assist  in  reengineering  avionics  simulation  support 
software written in FORTRAN to Ada. The RET emphasizes
increasing the maintainability of the software,
preserving  its  functionality,  and  improving  the  struc¬ 
ture  of  the  code.  The  RET  does  not  concentrate  on 
configuration  management,  testing,  or  preliminary  re¬ 
structuring  because  there  are  many  commercial  tools 
that  can  be  used  in  conjunction  with  the  RET  to  pro¬ 
vide  those  capabilities. 

The RET will relieve the engineer of syntactic
minutiae, i.e., differences between the source and
target programming language syntax, that divert attention
from the more important design and implementation
decisions requiring human judgement. We
believe  that  by  concentrating  on  tasks  that  are  well- 
suited  to  automated  support,  the  RET  will  reduce  the 
resources  needed  to  reengineer  avionics  support  soft¬ 
ware  and  will  help  the  human  engineer  produce  a  more 
maintainable system.

Abstract

With  the  increasing  reliance  on  computer  control 
of  embedded  real-time  systems  in  diverse  civilian  and 
military  applications  such  as  avionics,  air-traffic  con¬ 
trol,  patient  monitoring,  and  automated  manufactur¬ 
ing,  the  problem  of  re-engineering  an  aging  software 
base  will  confront  a  growing  number  of  organizations 
in  this  decade.  Facing  this  challenge  is  particularly 
important  because  of  the  economic  and  security  ram¬ 
ifications  of  maintaining  complex  embedded  systems 
that  operate  under  increasingly  strict  dependability 
and  timing  requirements.  After  discussing  some  of 
the  trends  in  the  field  of  computing  that  have  di¬ 
rectly  influenced  the  changing  requirements  on  the 
existing  systems,  this  paper  presents  a  re-engineering 
approach  based  on  the  formal  specification  of  a  sys¬ 
tem  through  decomposition  into  a  collection  of  ser¬ 
vices  with  well-defined  interfaces.  An  approach  based 
on  formal  specification  nicely  complements  the  devel¬ 
opment  of  a  toolkit  of  common  services  that  can  be 
reused  in  re-engineering  multiple  embedded  systems. 
Precise  specification  of  re-engineered  components  also 
supports  automated  tools  for  testing,  fault-injection, 
verification  and  run-time  monitoring. 

1  Introduction 

Complex  embedded  real-time  systems  are  being 
used  in  diverse  applications  such  as  avionics,  air-traffic 
control,  automated  manufacturing,  and  patient  mon¬ 
itoring.  With  the  increasing  reliance  on  digital  com¬ 
puters  in  monitoring  and  controlling  embedded  real¬ 
time  systems,  both  industry  and  government  organi¬ 
zations  are  faced  with  the  problem  of  extending  and 
modifying  an  “aging  software”  base.  Many  software 


systems  that  are  pivotal  to  the  operations  of  commer¬ 
cial  and  defense  systems  are  becoming  more  and  more 
costly  to  operate  [7].  Each  change  to  the  software  re¬ 
sults  in  one  or  more  patches  that  contribute  to  a  sys¬ 
tem  that  is  often  less  efficient,  less  reliable,  and  more 
difficult to maintain. Consequently, re-engineering existing
software is a significant and growing challenge that
confronts  us  in  the  1990s  and  beyond. 

Re-engineering,  also  called  renovation  or  reclama¬ 
tion  in  the  literature  [1],  refers  to  the  recovery  of  de¬ 
sign  information  and  its  uses  to  alter  or  reconstitute 
the  existing  system  to  improve  its  quality  or  to  meet 
new requirements. The increasing importance of re-engineering
has several economic, security, and technological
ramifications:

• Economic competitiveness in a global market is
directly linked to the thousands of aging information
systems and engineering applications that
are being used across large and small companies.

•  Embedded  real-time  systems  often  have  strict  de¬ 
pendability  and  timing  requirements.  They  often 
control  and  monitor  defense  systems  as  well  as 
commercial  safety-critical  operations.  Interrup¬ 
tions  in  a  timely  delivery  of  services  may  have 
significant  national  security  or  economic  implica¬ 
tions. 

•  The  existing  software  base  is  the  infrastructure 
on  which  future  technology  is  built.  Achieving  a 
technological  edge  in  the  next  decade  is  depen¬ 
dent  on  the  strength  of  this  infrastructure. 

This position paper argues that the decomposition
of a system into a collection of services (or building
blocks) with precise specifications and well-defined interfaces
may be the key to re-engineering complex embedded
systems.¹ The paper further advocates a
toolkit approach to re-engineering of these systems:
by identifying a collection of distributed services common
to a large class of embedded real-time systems,
these building blocks can be reused to reduce the complexity
and the cost of modifying and extending large
systems. We illustrate one such toolkit that was developed
at IBM Research to support distributed fault-tolerant
systems. Finally, the paper argues that an
approach based on formal specification and precise
interface definitions also facilitates the development
of automated tools that aid in demonstrating that a
re-engineered system meets the new requirements. A
brief discussion of a suite of tools, currently under development,
for testing, fault injection, verification, and
run-time monitoring of embedded real-time systems is
also presented.

2  Distinguishing  Characteristics  of  Em¬ 
bedded  Real-Time  Systems 

Embedded  real-time  systems  often  interact  with  the 
external  environment  and  operate  under  strict  timing 
and  dependability  requirements.  As  shown  in  Figure 
1,  an  embedded  real-time  system  can  be  decomposed 
into  three  components:  the  controlled  object,  the  com¬ 
puter  system,  and  the  operator.  The  controlled  ob¬ 
ject  and  the  operator  are  the  environment  of  the  sys¬ 
tem.  The  interface  between  the  real-time  computer 
system  and  the  controlled  object  is  called  the  instru¬ 
mentation  interface,  consisting  of  sensors  and  actua¬ 
tors.  The  interface  between  the  computer  system  and 
the  operator  is  called  the  man-machine  (or  the  oper¬ 
ator)  interface.  The  operator  monitors  and  controls 
the  object  via  this  interface  to  the  computer  system. 
Embedded  real-time  systems  are  in  essence  responsive: 
they  often  interact  with  the  environment  by  “reacting 
to  stimuli  of  external  events  and  producing  results, 
within  specified  timing  constraints”  [3].  To  guarantee 
this  responsiveness,  the  system  must  be  able  to  tol¬ 
erate  failures.  Hence,  a  fundamental  requirement  of 
fault-tolerant  real-time  systems  is  that  they  provide 
the  expected  service  in  a  timely  manner  even  in  the 
presence  of  faults. 

The above description depicts a real-time computer
system as a single entity. With the introduction of inexpensive
microprocessors and dense memories in recent
years, a rapid shift from large centralized computer
systems to clusters of closely-coupled nodes has
taken place. In fact, most real-time computer control
systems are now distributed (Figure 2) in the sense
that they consist of a set of nodes interconnected by a
real-time communication subsystem. Conceptually, a
real-time computer system is providing a set of well-defined
services to the environment. These services
must be made fault-tolerant to meet the availability
and reliability requirements on the entire system.

¹A more detailed version of this paper can be found in

In  summary,  re-engineering  embedded  software  is 
complicated  by  three  characteristics  of  real-time  sys¬ 
tems: 

1.  Timing  Constraints:  The  system  must  provide 
timely  service  even  in  the  presence  of  faults.  The 
correctness  of  a  computation  is  dependent  not 
only  on  the  correctness  of  its  results,  but  also  on 
meeting  stringent  timing  requirements. 

2.  Dependability  Requirements:  Embedded  real¬ 
time  systems  often  have  strict  availability  and 
reliability  requirements.  Failure  of  a  system  may 
result  in  catastrophic  loss  of  life  or  property. 

3.  Interaction  with  Environment:  A  real-time  com¬ 
puter  system  reacts  to  stimuli  from  the  external 
environment.  The  interaction  with  the  external 
world  makes  the  goal  of  meeting  strict  timing  and 
dependability  requirements  more  difficult. 

3  Changing  Requirements  and  the  Search 
for  a  Silver  Bullet 

The ever-changing requirements on existing systems
are often mentioned as a source of continuing
modifications and extensions to a software system. These
changes  are  sometimes  needed  because  the  original  de¬ 
sign  was  not  robust  enough  or  because  new  software 
bugs  (or  features)  are  discovered.  However,  as  sug¬ 
gested  in  the  previous  paragraph,  several  technological 
trends  in  the  computing  field  have  contributed  to  the 
changing  requirements  on  our  existing  software-based 
applications.  These  trends  include: 

• Shift from centralized systems to distributed
computing,

• Shift from monolithic operating systems to
microkernels,

• Shift from proprietary hardware/software
components to inter-operability of systems,

• Significant improvement in cost/performance
ratio for processors, memory, and storage devices,
and

• Shift from human-control to computer-control of
embedded systems.

Figure 1: An Embedded Real-Time System.

Figure 2: A Distributed Real-Time Computer System.

These dramatic trends in the computing field have
contributed  significantly  to  the  increasing  changes  of 
the  requirements  on  the  existing  embedded  systems. 
Hence,  we  are  faced  with  an  aging  software  base 
that  must  be  modified,  extended,  or  in  some  cases 
completely  redesigned  and  re-implemented.  However, 
there  is  no  silver  bullet.  The  cost  of  re-engineering 
is  often  very  high  and  the  existing  tools  are  in  most 
cases  very  primitive:  tools  for  extracting  syntactic  or 
structural information are available, whereas tools for
capturing  semantic  information  are  rare. 

Re-engineering a complex embedded system becomes
manageable only by a careful decomposition of
the system into a collection of services with precise
interfaces that are preserved from one version to
another. Decomposition of a system into a collection of
formally  specified  services  also  lends  itself  to  the  reuse 
of  software  in  different  systems:  by  building  toolkits  of 
services  that  capture  common  functions  in  a  range  of 
embedded  real-time  systems,  one  can  reduce  the  cost 
of  re-engineering  through  the  reuse  of  software.  We 
elaborate  on  this  point  in  the  following  sections. 

4 Precise Specification and Well-Defined
Interfaces

A  key  to  re-engineering  complex  embedded  software 
systems  is  the  capturing  of  important  attributes  of  a 
system  at  an  appropriate  level  of  abstraction.  This 
includes  capturing  both  the  requirements  specification 
and  the  key  design  attributes  of  the  system.  At  one  ex¬ 
treme,  an  existing  implementation  itself  can  be  viewed 
as  a  concrete  specification  of  the  system.  However, 
this  information  is  far  too  detailed  and  it  exposes  much 
of  the  implementation  that  may  not  carry  over  to  the 
re-engineered  system.  At  the  other  extreme,  an  infor¬ 
mal  (English)  requirements  specification  of  the  system 
can be viewed as a high-level description. The problem
here is the informality of the specification and the
lack  of  precise  semantics.  Hence,  specification  tools 
are  needed  that  aid  the  software  engineer  in  capturing 
system  requirements  and  key  features  of  an  existing 
design. 

An  issue  related  to  the  specification  of  requirements 
and  design  attributes  is  the  precise  definition  of  inter¬ 
nal  and  external  interfaces  for  a  system  that  is  being 
re-engineered.  A  complete  or  a  partial  re-engineering 
of  a  system  is  feasible  if  we  view  an  embedded  sys¬ 
tem  as  a  collection  of  building  blocks  (or  services) 
with  well-defined  external  interfaces.  Each  service,  in 
turn,  may  be  implemented  using  other  services.  This 
model naturally supports composition of more primitive
building blocks so that more complex services
are  constructed.  This  is  the  same  philosophy  that 
guided  the  initial  design  of  the  new  generation  of  the 
air-traffic  control  system  [2].  Localizing  the  modifica¬ 
tions  or  extensions  becomes  manageable  if  there  is  a 
precise  functional  specification  of  each  building  block 
with well-defined interfaces. As discussed in Section 6,
a precise and formal specification is also helpful
in  supporting  automated  tools  for  testing,  verification 
and  run-time  monitoring  of  a  re-engineered  system. 
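
The idea of a building block with a well-defined external
interface can be illustrated with a small sketch. It is not
taken from any system cited in this paper; the names
NamingService, InMemoryNaming, ReplicatedNaming, and locate
are hypothetical and serve only to show how an implementation
can be replaced while the interface is preserved.

```python
from typing import Protocol


class NamingService(Protocol):
    """Hypothetical external interface of one building block."""

    def register(self, name: str, processor: str) -> None: ...

    def lookup(self, name: str) -> list[str]: ...


class InMemoryNaming:
    """Existing implementation: a single lookup table."""

    def __init__(self) -> None:
        self._table: dict[str, list[str]] = {}

    def register(self, name: str, processor: str) -> None:
        self._table.setdefault(name, []).append(processor)

    def lookup(self, name: str) -> list[str]:
        return list(self._table.get(name, []))


class ReplicatedNaming:
    """Re-engineered implementation: composed from two copies of the
    more primitive service, exporting the same interface."""

    def __init__(self) -> None:
        self._replicas = [InMemoryNaming(), InMemoryNaming()]

    def register(self, name: str, processor: str) -> None:
        for replica in self._replicas:
            replica.register(name, processor)

    def lookup(self, name: str) -> list[str]:
        # Answer from the first replica that knows the name.
        for replica in self._replicas:
            found = replica.lookup(name)
            if found:
                return found
        return []


def locate(service: NamingService, name: str) -> list[str]:
    """Client code depends only on the interface, not the implementation."""
    return service.lookup(name)
```

Because locate is written against the interface alone, either
implementation can be substituted without modifying the client,
which is the sense in which modifications stay localized.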

5  Reuse  of  Building  Blocks  in  Re- 
Engineering  Embedded  Systems 

The  reuse  of  software  can  provide  additional  lever¬ 
age  in  re-engineering  embedded  systems.  Since  we  will 
be faced with an increasing number of complex systems
that will require partial or complete redesign and
re-implementation, it makes sense to develop a collection
of building blocks that are reused in different systems.
By capturing some of the changing requirements that
are common to a large collection of embedded real-time
systems,  one  can  develop  a  set  of  services  with  well- 
defined  interfaces  that  are  used  in  re-engineering  large 
systems.  One  can  view  this  approach  as  a  toolkit  ap¬ 
proach  to  re-engineering. 

As  mentioned  earlier,  the  shift  toward  distributed 
computing  and  the  emphasis  on  open  systems  are 
among  the  most  important  factors  contributing  to 
the  changing  requirements  on  the  existing  software 
base.  Hence,  a  collection  of  distributed  fault-tolerant 
services that can be used as building blocks in
re-engineering real-time systems may prove to be a
logical first step. In the following subsection, we introduce
a testbed that was built at IBM Research for developing
fault-tolerant servers [6]. A new version of this
architecture specifically designed for real-time systems
is currently under development at the University of
Michigan.

5.1  Toolkit  of  Services:  An  Example 

The  testbed  consists  of  a  collection  of  protocols 
for  managing  replicated  and  distributed  resources  in 
a  system.  It  consists  of  six  software  layers,  each  ex¬ 
porting  a  well-defined  interface  to  the  other  layers  or 
to  applications  that  are  built  on  top  of  the  testbed. 
Figure  3  illustrates  the  software  layers  in  the  testbed. 
Each  software  layer,  referred  to  as  a  service,  supports 
one  or  more  protocols.  A  brief  description  of  each 
service  layer  follows: 

•  multicast  communication  service:  provides  a  reli¬ 
able  datagram  communication  service  for  sending 
a  message  to  a  collection  of  destinations.  This 
service  allows  the  exploitation  of  available  com¬ 
munication  protocols  (e.g.,  Netbios  vs.  UDP) 
and  possible  hardware  support  (e.g.,  hardware 
broadcast  facility)  on  a  given  system  without  ex¬ 
posing  the  implementation  to  the  higher  layer 
services. 

•  processor  membership  and  monitoring  service: 
provides  a  consistent  view  of  the  operational  sta¬ 
tus  of  a  group  of  processors  in  the  presence  of 
processor/process  failures/joins  and  communica¬ 
tion  failures.  Three  membership  protocols  with 
varying  degrees  of  consistency  in  the  views  of  the 
members  are  supported. 

•  clock  synchronization  service:  provides  a  bound 
on  the  deviation  between  logical  clocks  on  pro¬ 
cessors  in  the  presence  of  hardware  clock  drifts 
and  failures. 

•  reliable  naming  service:  provides  a  reliable  ser¬ 
vice  mapping  the  name  of  an  object  to  a  list  of 
processors  in  the  system.  This  layer  supports 
multiple  namespaces. 

•  distributed cache service: provides
shared/exclusive access to remote objects
with  local  caching.  This  layer  supports  multiple 
coherency  protocols  including  cache  invalidation 
and  write-through  policies. 

•  replication  service:  provides  a  mechanism  for 
maintaining multiple copies of objects in a cluster.
This layer supports several replication protocols
with different consistency semantics for updating
replicas.

•  distributed  synchronization  service:  provides 
fault-tolerant  and  scalable  synchronization  pro¬ 
tocols  for  serializing  access  to  shared/exclusive 
resources.  The  distributed  synchronization  ser¬ 
vice  can  recover  from  the  failure  of  a  lock 
holder/coordinator  and  communication  failures. 

•  service failure detector: provides a notification/query
service for monitoring status changes
of  a  collection  of  subsystems  grouped  together  as 
a  server.  A  status  change  to  a  group  can  occur 
because  of  a  subsystem  failure  or  an  update  to 
an  application-defined  status  field. 

The  above  software  services  are  the  building  blocks 
from  which  much  larger  systems  can  be  developed. 
These  building  blocks  can  be  used  to  replace  partially 
or  completely  a  system  component  that  is  being  re¬ 
engineered.  The  objective  is  to  develop  sufficiently 
general  building  blocks  once  and  to  reuse  them  in  ex¬ 
tending  and  modifying  existing  software  systems. 
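
To make the notion of a reusable service more concrete, the
following sketch shows a toy processor membership view. The
protocols actually used in the testbed are not described here,
so this is a generic heartbeat-and-timeout scheme chosen for
illustration only; the class name HeartbeatMembership and its
parameters are hypothetical. It tracks a single local view and
does not attempt the agreement among processors that a real
membership protocol must provide.

```python
class HeartbeatMembership:
    """Toy membership view driven by heartbeats.

    Assumption: a processor is considered operational if a heartbeat
    from it was recorded within the last `timeout` time units.
    """

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self._last_seen: dict[str, float] = {}

    def heartbeat(self, processor: str, now: float) -> None:
        """Record that `processor` was heard from at time `now`."""
        self._last_seen[processor] = now

    def view(self, now: float) -> list[str]:
        """Return the processors currently believed to be operational."""
        return sorted(
            p for p, seen in self._last_seen.items()
            if now - seen <= self.timeout
        )
```

A higher-level service would call view to decide, for example,
which replicas to update, without knowing how liveness is
detected underneath.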

6  Testing,  Verification,  and  Monitoring 
Tools 

One  can  view  the  re-engineering  of  existing  soft¬ 
ware  as  a  magic  trick.  The  existing  system  and  the 
new  requirements  are  fed  into  this  complicated  pro¬ 
cess,  and  the  output  is  a  new  system  that  meets  these 
requirements.  However,  ensuring  that  the  new  sys¬ 
tem  meets  the  imposed  requirements  is  an  important 
problem  that  must  be  addressed.  Embedded  real-time 
systems  have  strict  timing  and  dependability  require¬ 
ments.  Rigorous  methods  must  be  applied  to  demon¬ 
strate  that  the  system  meets  these  constraints. 

Decomposition  of  a  system  into  a  set  of  services 
with  precise  specifications  and  clean  interfaces  pro¬ 
vides  an  advantage  in  achieving  this.  We  advocate 
a  three-way  approach  in  demonstrating  that  a  re¬ 
engineered  system  meets  its  specification: 

1.  Testing  and  fault-injection  methods 

2.  Formal  verification  methods 

3.  Run-time  monitoring  techniques 


Testing  and  fault-injection  techniques  exercise  nu¬ 
merous  execution  paths  in  the  new  system  to  de¬ 
tect  design  and  implementation  faults.  Fault-injection 
techniques  are  used  to  test  the  fault-tolerance  capa¬ 
bilities  of  the  new  protocols  and  the  robustness  of 
the system with respect to failures. Formal verification
techniques are effective in proving the correctness
of  an  implementation  with  respect  to  a  specification. 
These  techniques  will  be  effective  in  ensuring  the  cor¬ 
rectness  of  certain  critical  tasks  in  a  system.  Formal 
verification  complements  testing  and  fault-injection  by 
demonstrating  that  critical  operations,  for  example, 
meet  certain  safety  assertions.  Testing  and  formal  ver¬ 
ification  techniques  may  not  guarantee  against  viola¬ 
tion  of  design  assumptions  and  unpredictable  behav¬ 
ior  of  the  external  environment.  Hence,  monitoring 
of  a  system  is  necessary  to  detect  violation  of  safety 
properties and design assumptions at run-time [5, 8].
It  should  be  noted  that  a  formal  specification  of  a 
system  and  the  interfaces  of  its  building  blocks  are 
important  for  developing  automated  tools  for  testing, 
verification,  and  monitoring.  We  are  currently  inves¬ 
tigating  a  collection  of  automated  tools  for  testing, 
fault-injection,  verification  and  monitoring  that  aid  a 
software  engineer  in  modifying  and  extending  existing 
software. 
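
A minimal sketch of how run-time monitoring and fault injection
fit together is given below. It is an illustration under stated
assumptions, not one of the tools under investigation: the names
Monitor, FakeClock, and make_task are hypothetical, and a
simulated clock stands in for real time so that a timing fault
can be injected deterministically.

```python
from dataclasses import dataclass, field
from typing import Callable


class FakeClock:
    """Simulated time source, so timing faults can be injected."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, dt: float) -> None:
        self.now += dt


@dataclass
class Monitor:
    """Checks one timing constraint and one safety assertion per call."""

    clock: Callable[[], float]       # injected time source
    deadline: float                  # maximum allowed execution time
    safe: Callable[[float], bool]    # safety assertion on the result
    violations: list[str] = field(default_factory=list)

    def run(self, task: Callable[[], float]) -> float:
        start = self.clock()
        result = task()
        if self.clock() - start > self.deadline:
            self.violations.append("deadline")
        if not self.safe(result):
            self.violations.append("safety")
        return result


def make_task(clock: FakeClock, duration: float, value: float):
    """Build a task that 'takes' `duration` time units and returns `value`."""
    def task() -> float:
        clock.advance(duration)  # simulated execution time
        return value
    return task
```

Injecting a task that overruns its deadline, or one that returns
an out-of-range value, exercises the monitor in the same way
that an unexpected event in the environment would at run-time.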

7  Conclusion 

Re-engineering embedded real-time systems is an
important challenge that confronts us in the coming
decade. With the increasing reliance on digital
computers for monitoring and controlling embedded
systems, the problem of modifying and extending
an existing software base to meet new requirements
is a key issue. The changing requirements on
the existing systems have been influenced by several
trends in the computing field including the shift to
distributed computing, the need for inter-operability of
hardware/software components, and the exploitation
of microkernel-based operating systems. Strict timing
and dependability requirements introduce additional
complexity in the re-engineering of embedded
real-time systems.

This paper advocated that the decomposition of a
system into a collection of services with precise
specification and well-defined interfaces is crucial in
reducing the complexity of re-engineering large systems.
Furthermore, by developing a collection of building
blocks (or services) that can be reused in different
systems, the cost of extending and modifying existing
systems may become more acceptable. However,
these building blocks must contain a common collection
of services that may be needed when addressing
the changing requirements of a large subset of
embedded real-time systems. Finally, this approach is
amenable to the development of a collection of tools
for ensuring that the new requirements are satisfied
by a re-engineered system.

Figure 3: The Software Layers of the Testbed.

Background

The  AN/UYK-7  Computer  Set  is  a  Navy 
standard  computer  originally  fielded  in  the  late  1970’s  and 
manufactured  by  Unisys.  The  computer  set  is  configured 
with  a  variety  of  modules  such  as  the  memory  module  or 
the  Central  Processing  Unit  (CPU)  module.  AN/UYK-7 
computers  are  used  across  a  wide  array  of  platforms  with 
a  population  of  hundreds  of  units. 

The  MK  86  Gun  Fire  Control  System  (GFCS) 
uses  the  AN/UYK-7(V)  Computer  Set  for  a  variety  of 
functions.  The  MK  86/UYK-7  computer  as  it  exists 
today faces several memory problems. Planned upgrades
in  targeting  radar  systems  require  more  memory  and 
faster  access  to  provide  a  performance  increase  in 
computer  execution  of  new  application  software.  The 
existing  memory  units  available  for  use  in  the  AN/UYK- 
7(V)  Computer  Set  are  inadequate  for  these  upgrades. 
Additionally,  the  existing  memory  units  face  severe 
obsolescence  problems.  The  mated  film  technology 
which  forms  the  basis  of  the  Double  Density  Mated  Film 
Memory Unit (DDMFM) is no longer in production. The
day is rapidly approaching when the UYK-7 as it exists
will  not  support  MK  86  GFCS  requirements.  The  MK  86 
GFCS  In  Service  Engineering  Agent  (ISEA)  at  Port 
Hueneme  Division  (PHD)  of  the  Naval  Surface  Warfare 
Center  (NSWC)  recognized  the  approaching  problem  in 
1991  and  took  steps  to  counter  it.  NSWC  Crane  Division 
was  enlisted  to  provide  product  engineering  support  to 
develop  a  technology  upgrade  for  the  MK  86  UYK-7 
Computer  Set.  In  an  age  of  ever-decreasing  defense 
budgets,  there  is  more  and  more  pressure  to  keep  existing 
equipments  upgraded  and  fielded.  This  paper  describes 
the  project  to  develop  an  Enhanced  Memory  Unit  (EMU) 
for  use  in  the  AN/UYK-7(V)  Computer  Set  and  the 
features  of  said  unit.  The  EMU  contains  an  embedded 
state-of-the-art  processor  which  essentially  gives  the 
AN/UYK-7  computer  access  to  new  found  powers.  In 
other  words,  we’ve  jacked  the  computer  up  and  put  a 
powerful  new  engine  under  it. 

Teaming  Arrangement 

The  project  is  a  teaming  arrangement  between 
NSWC  Port  Hueneme  Division  (PHD),  NSWC  Crane 
Division  and  PMS-412  UYK-7  Program  Office.  NSWC 
Crane  is  developing  the  MK  86/UYK-7  Enhanced 
Memory  Unit  (EMU)  and  will  undertake  the  MK  86 
production  run  of  the  EMU.  PHD  is  writing  the 
application  software  for  use  with  the  MK  86  upgrades  and 
will  perform  the  ORDALT  to  install  the  EMU  into  MK 
86/UYK-7  computer  sets.  PHD  is  also  responsible  for 
overall  project  direction.  PMS-412  is  providing  physical 
assets  for  refurbishment  and  effecting  the  field  change  to 
the  UYK-7  computer  set. 

Goals & Objectives

The  original  goal  of  the  project  was  to  develop 
a  semiconductor  memory  replacement  for  the  UYK-7 
Double  Density  Mated  Film  Memory  (DDMFM).  The 
Enhanced Memory Unit was to have 4 times the memory
and a 30% performance enhancement. A further goal was
to provide units to meet the scheduled ship installation of
other MK 86 upgrades. During the course of the project,
the  goals  changed  to  attempt  to  develop  a  unit  which 
could  provide  a  solution  for  MK  86/UYK-7  requirements 
into  the  next  century. 

Accomplishments

All  of  the  original  design  goals  will  be  met  or 
exceeded  by  the  EMU  which  has  been  developed  with 
major  improvements  in  power  consumption  and  MTBF. 
More  significantly,  the  EMU  will  provide  performance 
enhancements  of  approximately  1000%  or  better  through 
a  technology  insertion.  The  technology  insertion  is  the 
use  of  an  embedded  state-of-the-art  processor,  the 
M68040.  A  simple  drawer  replacement  (slide  out  the  old 
unit  and  slide  in  the  new  EMU)  will  give  the  UYK-7 
state-of-the-art  processing  capabilities. 

The  project  is  currently  on  schedule  and  within 
budget.  The  first  Engineering  Developmental  Model 
(EDM)  was  delivered  in  November  1993.  According  to 
a study by PHD, it is anticipated that this systems
re-engineering technological solution will result in a $60M
cost avoidance.

Finally,  it  is  anticipated  that  the  EMU  will 
provide  an  excellent  option  for  other  UYK-7  users  faced 
with  the  same  DDMFM  obsolescence,  lack  of  memory 
and  needed  performance  enhancements  which  prompted 
MK  86  to  initiate  this  project.  PMS-412  estimates  that  as 
many as 800 EMUs could be needed by the UYK-7
community. 

Technical  Features 

The  Enhanced  Memory  Unit  (EMU)  had  many 
design  goals.  The  first  was  to  remain  form,  fit,  and 
functionally  equivalent  to  the  current  Double  Density 
Mated  Film  Memory  (DDMFM)  which  is  a  32K  x  32 
memory  module  used  in  the  UYK-7  Computer  Set.  This 
would  lead  to  easy  installation  of  the  EMU  on  board 
ships.  The  second  goal  was  to  increase  memory  capacity 
to 128K x 32 in a memory module so that single and dual
bay configurations of the UYK-7 could reach full memory
capacity without adding more cabinets. The third goal
was  to  increase  the  performance  of  the  UYK-7  by 
decreasing  the  memory  access  time.  Early  tests  showed 
that  only  marginal  performance  increases  could  be 
obtained  with  decreased  access  time,  so  a  processor  was 
added  to  the  memory  module  to  offload  computational 
processing  tasks  from  the  UYK-7  CPU. 

The choice of the processor was quickly
narrowed to the Motorola 68000 family since the current
compiler for the UYK-7, CMS2K, presently supports the
Motorola 68000 series processors. Since most of the
computational tasks to be offloaded and future upgrades
require  floating  point  calculations,  an  integral  floating 
point  processor  was  desired.  With  these  considerations, 
the  Motorola  68040  became  the  choice. 

The  Enhanced  Memory  Unit  (EMU)  is  composed 
of 5 main functional areas: 128K x 32 non-volatile
memory array, 8 access channels, control section, power
distribution,  and  EMU  Processor  unit.  The  block 
diagram  for  the  EMU  is  shown  in  Figure  1. 

The  memory  array  is  formed  of  8K  x  8  non¬ 
volatile  static  RAM  microcircuits.  These  microcircuits 
consist  of  a  fast  static  RAM  section  with  a  shadow 
EEPROM  section.  During  normal  operation,  the  memory 
operates  from  the  static  RAM  section  providing  fast 
access  for  both  read  and  write  operations.  The  contents 
of the static RAM are stored in EEPROM when the input
power  is  detected  to  be  below  normal  operating  voltage. 
The  time  required  to  make  this  transfer  is  12 
milliseconds.  Therefore,  the  internal  voltages  generated 
in  the  EMU  must  be  maintained  during  this  time.  This  is 
accomplished  by  having  a  large  diode  isolated  capacitor 
on  the  input  of  the  EMU  power  converter.  On  power  up, 
the  contents  of  the  EEPROM  are  transferred  to  the  static 
RAM section. The time required to do this is 25
microseconds  which  occurs  after  the  EMU  power 
converter  reaches  normal  operating  voltage  and  while  the 
UYK-7  master  reset  is  still  active.  This  gives  the 
memory  unit  its  non-volatile  appearance  and  does  not 
affect  any  of  the  normal  operating  time  of  the  UYK-7. 
The EEPROM section has the same characteristics as any
other EEPROM, which has a minimum of 10,000 write
cycles. Since writes to the EEPROM only occur during
power  removal,  this  will  provide  several  years  of  failure 
free  non-volatile  operation. 
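
The endurance claim can be checked with simple arithmetic. The
sketch below assumes one EEPROM store per power removal, as
described above; the number of power removals per day is not
given in this paper and is a purely illustrative parameter.

```python
MIN_WRITE_CYCLES = 10_000  # minimum EEPROM write cycles, from the text


def eeprom_lifetime_years(write_cycles: int,
                          power_removals_per_day: float) -> float:
    """Years until the rated write-cycle count is reached, assuming
    exactly one EEPROM store per power removal."""
    days = write_cycles / power_removals_per_day
    return days / 365.0


# Illustrative duty cycles only (assumed, not from the paper).
for removals in (1, 2, 5):
    years = eeprom_lifetime_years(MIN_WRITE_CYCLES, removals)
    print(f"{removals} removal(s)/day: {years:.1f} years")
```

Even at an assumed five power removals per day the rated cycle
count lasts more than five years, which is consistent with the
statement that several years of operation can be expected.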

Eight  access  channels  are  provided  to  interface 
the  memory  unit  to  the  UYK-7  CPU  and  Input  Output 
Controller  (IOC).  The  CPU  requires  two  interface 
channels;  one  to  fetch  instructions  and  one  to  obtain 
operands. The IOC requires only one interface channel.
Any combination of CPUs and IOCs can be interfaced to
a memory unit within the limits of 8 channels. The eight
access channels share the memory array through a simple
channel ranking priority arbitration scheme. This means
the highest ranking channel with an active channel request
will get the next available memory cycle. The EMU
contains 4 times the available memory of the DDMFM
which  means  that  the  EMU  must  provide  faster  access 
time  to  provide  equivalent  performance  of  the  DDMFM 
in  certain  configurations.  The  DDMFM  provided  a  single 
cycle  read  access  time  of  1100  nanoseconds  with  a 
memory  cycle  time  of  750  nanoseconds.  The  EMU  was 
designed  with  a  single  read  access  time  of  750 
nanoseconds  with  a  memory  cycle  time  of  150 
nanoseconds.  The  750  nanosecond  single  cycle  access 
time approaches the limit of the UYK-7 CPU. A single
EMU  provides  better  performance  than  two  DDMFMs 
used  in  a  software  interleaved  configuration. 
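
The channel budget and the arbitration rule described above can
be sketched as follows. This is a behavioral illustration, not
the EMU logic design: the function names are hypothetical, and
the convention that channel 0 holds the highest rank is an
assumption, since the actual ranking order is not stated here.

```python
from typing import Optional, Sequence

NUM_CHANNELS = 8        # access channels per memory unit, from the text
CHANNELS_PER_CPU = 2    # one for instructions, one for operands
CHANNELS_PER_IOC = 1


def configuration_fits(cpus: int, iocs: int) -> bool:
    """True if the CPUs and IOCs fit within the eight access channels."""
    needed = cpus * CHANNELS_PER_CPU + iocs * CHANNELS_PER_IOC
    return needed <= NUM_CHANNELS


def arbitrate(requests: Sequence[bool]) -> Optional[int]:
    """Grant the next memory cycle to the highest-ranking channel with
    an active request. Assumption: lower index means higher rank."""
    for channel, active in enumerate(requests):
        if active:
            return channel
    return None  # no channel is requesting
```

A fixed ranking of this kind is simple to implement, but a
low-ranking channel is served only when no higher-ranking channel
is requesting, so the short memory cycle time matters for keeping
all channels responsive.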

The  control  section  manages  the  data  flow 
through  the  memory  unit.  The  control  unit  is  divided  into 
three  functional  blocks.  The  first  is  the  local  controller. 
The  local  controller  handles  all  the  control  signals  from 
the  UYK-7  required  for  the  transfer  of  information  over 
one  channel.  There  is  one  local  controller  for  each 
channel.  The  second  functional  block  is  the  priority 
arbiter which determines which of the requesting channels
will get the next memory cycle. The memory controller
is the third major control function. The memory
controller  provides  the  timing  and  the  decoding  required 
to  access  the  memory  array. 

The power distribution section generates the 5
volt power required by the EMU logic from the -90 volt
DC provided from the UYK-7 power supply. Detection
logic  determines  when  the  incoming  power  is  going  out 
of  tolerance  and  generates  the  timing  control  signals 
required  to  perform  the  store  from  static  RAM  to 
EEPROM.  A  diode  isolated  capacitor  provides  the  energy 
required  to  perform  the  store  after  incoming  power  has 
been  removed.  After  the  store  operation  is  complete,  the 
EMU  power  converter  is  shut  off  to  prevent  undesired 
store  operation  as  the  capacitor  slowly  discharges. 
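
The size of the hold-up capacitor can be estimated from the
energy needed during the store. The sketch below is a rough
calculation under stated assumptions and does not describe the
actual EMU component values. It uses the 12 millisecond store
time and the 80 Watt measured dissipation reported in this
paper, takes 90 volts as the magnitude of the starting voltage,
and assumes a hypothetical minimum converter input of 60 volts;
converter losses are ignored.

```python
def holdup_energy_joules(load_watts: float, hold_seconds: float) -> float:
    """Energy the capacitor must supply while the store completes."""
    return load_watts * hold_seconds


def min_capacitance_farads(load_watts: float, hold_seconds: float,
                           v_start: float, v_min: float) -> float:
    """Smallest capacitance whose usable energy,
    0.5 * C * (v_start**2 - v_min**2), covers the hold-up energy."""
    energy = holdup_energy_joules(load_watts, hold_seconds)
    return 2.0 * energy / (v_start ** 2 - v_min ** 2)


# 80 W and 12 ms are from the paper; 60 V is an assumed lower limit.
energy = holdup_energy_joules(80.0, 0.012)
cap = min_capacitance_farads(80.0, 0.012, v_start=90.0, v_min=60.0)
print(f"energy = {energy:.2f} J, capacitance = {cap * 1e6:.0f} uF")
```

Under these assumptions roughly one joule is needed, which
corresponds to a capacitor of a few hundred microfarads at the
input voltage. A real design would add margin for tolerance and
aging.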

The  EMU  Processor  unit  was  designed  to  offload 
processing  tasks  from  the  UYK-7  CPU.  Independent 
processing  tasks  can  be  downloaded  from  the  UYK-7 
CPU  to  the  EMU  Processor.  The  UYK-7  CPU  initiates 
tasks  providing  input  parameters  and  receives  processed 
results.  Communication  from  the  UYK-7  CPU  to  the 
EMU  processor  is  accomplished  through  shared  memory 
which  is  the  EMU  Memory  Array.  The  EMU  Processor 
has  no  access  to  any  external  peripherals  of  the 
AN/UYK-7. 

The  EMU  processor  unit  consists  of  a  68040  that 
has  local  memory  which  is  128K  x  32  words  of  volatile 
static  RAM  and  64K  x  32  words  of  EEPROM  as  shown 
in  Figure  2.  The  static  RAM  section  is  used  to  store 
tasks  loaded  from  the  UYK-7  CPU  and  to  set  up  data 
buffers  to  be  used  during  the  processing  of  tasks.  The 
EEPROM  section  contains  three  program  functions;  the 
floating  point  code  provided  by  Motorola  to  provide  full 
function  floating  point  processor,  built-in-test  routine  for 
checking  the  EMU  Processor  module,  and  a  monitor 
routine  to  allow  for  initial  download  and  start  of  tasks. 
The  EMU  processor  module  is  connected  to  the  address 
and  data  buses  of  the  EMU  memory  array  which  gives 
the  68040  access  to  the  entire  EMU  memory  array  as 
shown  in  Figure  1. 

The  upper  eight  memory  locations  of  the  EMU 
Memory  Array  are  reserved  for  control  of  the  EMU 
Processor. These memory locations are called the EMU
Processor Control Registers although they are dedicated
memory  locations  with  additional  decoding  circuitry  for 
special  functions.  These  registers  provide  a  Control 
Register,  a  Status  Register,  a  Software  Reset  Register,  a 
UYK-7  Interrupt  Register,  and  a  Built-in-Test  Register. 

The  Control  Register  allows  the  UYK-7  CPU  to 
send  commands  to  the  EMU  Processor  Monitor  Software. 
When  the  UYK-7  CPU  writes  to  this  register,  an  interrupt 
is  generated  to  the  68040.  The  68040  can  read  the 
register  and  perform  the  required  function. 

The  Status  Register  is  used  by  the  EMU 
Processor  to  provide  status  information  to  the  UYK-7 
CPU. The information passed in the Status Register can
indicate  such  things  as  the  EMU  Processor  being 
initialized,  a  valid  command  being  received,  and  the 
status  of  a  completed  task. 

The  Software  Reset  Register  allows  the  UYK-7 
CPU  to  reset  the  EMU  Processor.  The  reset  performed 
is  equivalent  to  a  power-up  reset. 

The  UYK-7  Interrupt  Register  provides  a  means 
of  signalling  the  UYK-7  CPU  when  the  EMU  Processor 
completes  a  task.  When  the  EMU  Processor  writes  to  the 
UYK-7 Interrupt Register, a Class II Interprocessor
interrupt  is  generated  on  the  UYK-7  CPU.  This  feature 
requires  that  a  wire  be  added  on  the  UYK-7  backplane 
from  the  EMU  module  to  the  UYK-7  CPU  module. 

The  Built-in-Test  Register  is  reserved  for  use 
with  the  EMU  Processor  built-in-test  software.  The  built- 
in-test  software  uses  this  register  to  test  the  interface  from 
the  EMU  Processor  to  the  EMU  Memory  Array.  The 
built-in-test  software  on  the  EMU  Processor  tests  the 
68040  functions,  the  RAM  and  EEPROM  on  the  EMU 
Processor  module,  and  the  interface  to  the  EMU  Memory 
Array.  The  built-in-test  does  not  test  any  other  function 
of  the  EMU.  This  check  out  is  performed  with  the 
standard  UYK-7  Diagnostics. 

To  allow  the  UYK-7  Diagnostics  to  be  executed 
without  change,  an  enable  feature  was  designed  into  the 
EMU  Processor.  When  the  EMU  is  powered  up,  the 
EMU Processor is not enabled. The EMU Processor cannot
access the EMU Memory Array and will not respond
to  commands  from  the  EMU  Processor  Control  Registers. 
Therefore  the  EMU  Processor  Control  Registers  respond 
the  same  as  normal  memory  locations.  This  allows  the 
UYK-7  Diagnostics  to  run  unchanged.  The  EMU 
Processor  is  enabled  by  writing  a  coded  sequence  to  the 
Status  Register  memory  location.  This  enables  all  the 
functions  of  the  EMU  Processor.  The  EMU  Processor  is 
disabled  by  any  reset  including  the  software  reset. 
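
The behavior of the control registers and the enable feature
can be summarized in a small behavioral model. It is a sketch
for illustration, not the EMU hardware or monitor software: the
register names used as dictionary keys, the class name
EmuProcessorModel, and in particular the two-word
ENABLE_SEQUENCE are hypothetical, since the actual coded
sequence is not given here.

```python
ENABLE_SEQUENCE = (0xA5, 0x5A)  # hypothetical; real sequence not given


class EmuProcessorModel:
    """Behavioral model of the EMU Processor Control Registers."""

    def __init__(self) -> None:
        self.enabled = False  # not enabled after power-up
        self.regs = {"control": 0, "status": 0, "reset": 0,
                     "uyk7_interrupt": 0, "bit": 0}
        self._status_writes: list[int] = []
        self.interrupts_to_68040 = 0
        self.interrupts_to_uyk7 = 0

    def uyk7_write(self, reg: str, value: int) -> None:
        """A write from the UYK-7 CPU; the location always stores data."""
        self.regs[reg] = value
        if not self.enabled:
            # Registers act as normal memory; only the coded sequence
            # written to the Status Register location enables the unit.
            if reg == "status":
                self._status_writes.append(value)
                tail = tuple(self._status_writes[-len(ENABLE_SEQUENCE):])
                if tail == ENABLE_SEQUENCE:
                    self.enabled = True
            return
        if reg == "control":
            self.interrupts_to_68040 += 1  # command interrupt to the 68040
        elif reg == "reset":
            self.reset()                   # software reset

    def emu_write(self, reg: str, value: int) -> None:
        """A write from the EMU Processor; ignored while disabled."""
        if not self.enabled:
            return
        self.regs[reg] = value
        if reg == "uyk7_interrupt":
            self.interrupts_to_uyk7 += 1   # Class II interrupt to the UYK-7

    def reset(self) -> None:
        """Any reset disables the EMU Processor."""
        self.enabled = False
        self._status_writes = []
```

In this model a diagnostic that writes and reads back the
register locations before the enable sequence sees ordinary
memory behavior, which is why the existing UYK-7 Diagnostics
can run unchanged.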

As  a  result  of  the  redesign  effort,  two  other 
improvements  were  obtained  for  the  UYK-7  Computer 
Set:  decreased  power  consumption  and  an  increased 
reliability  over  the  DDMFM.  The  present  UYK-7  system 
has  marginal  cooling  capacity.  Some  long  term  failure 
problems have been attributed to inadequate cooling. The
present  DDMFM  dissipates  250  Watts.  In  a  single  bay 
configuration,  there  can  be  up  to  three  DDMFMs.  One 
EMU  has  the  memory  capacity  of  four  DDMFMs  and  has 
a  measured  power  dissipation  of  80  Watts.  If  no  memory 
capacity  increase  is  desired,  this  offers  up  to  a  670  Watt 
power  dissipation  saving  in  this  one  cabinet  and  provides 
additional  cooling  capacity  for  the  remaining  modules  in 
the  cabinet. 
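
The 670 Watt figure follows directly from the numbers above, as
the short calculation below shows. The function name is
hypothetical; the wattages and module counts are those stated
in the text.

```python
DDMFM_WATTS = 250       # dissipation of one DDMFM, from the text
EMU_WATTS = 80          # measured dissipation of one EMU, from the text
MAX_DDMFM_PER_BAY = 3   # DDMFMs in a single bay configuration


def power_saving_watts(ddmfms_replaced: int, emus_installed: int = 1) -> int:
    """Reduction in dissipation when DDMFMs are replaced by EMUs."""
    return ddmfms_replaced * DDMFM_WATTS - emus_installed * EMU_WATTS


# Replacing all three DDMFMs in one bay with a single EMU.
print(power_saving_watts(MAX_DDMFM_PER_BAY))  # prints 670
```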

The  DDMFM  has  one  of  the  highest  failure  rates 
of  the  modules  in  the  UYK-7  Computer  Set  with  the 
primary  failures  being  the  mated  film  stack  and  the 
associated  drivers.  The  EMU  is  designed  with  all  solid 
state  devices  which  have  inherently  lower  failure  rates. 
The  calculated  MTBF  per  MIL-STD-217E  is  8600  hours 
which  is  approximately  4  times  better  than  the  actual 
failure  rate  of  the  DDMFM.  With  the  fact  that  fewer 
modules  are  required  for  the  same  memory  capacity,  this 
offers  a  considerable  reliability  improvement  for  the 
UYK-7  Computer  Set. 
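
The combined effect of a better module MTBF and a smaller module
count can be estimated with a simple series model. The sketch
below is an approximation under stated assumptions, not a
result from the reliability analysis: it assumes constant and
independent failure rates, treats the failure of any memory
module as a failure of the memory, and derives a DDMFM MTBF of
about 2150 hours from the statement that the EMU figure is
approximately 4 times better.

```python
EMU_MTBF_HOURS = 8600.0                   # calculated value, from the text
DDMFM_MTBF_HOURS = EMU_MTBF_HOURS / 4.0   # derived from "approximately 4 times"


def series_mtbf(module_mtbf_hours: float, count: int) -> float:
    """MTBF of `count` identical modules in series, assuming constant,
    independent failure rates (the rates add)."""
    failure_rate = count / module_mtbf_hours
    return 1.0 / failure_rate


# Four DDMFMs are needed to match the capacity of one EMU.
ddmfm_set = series_mtbf(DDMFM_MTBF_HOURS, 4)
improvement = EMU_MTBF_HOURS / ddmfm_set
print(f"4 DDMFMs: {ddmfm_set:.1f} h, improvement: {improvement:.0f}x")
```

Under these assumptions, one EMU is roughly sixteen times more
reliable than the four DDMFMs it replaces at equal memory
capacity.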

Software Re-engineering

In  the  late  1960’s,  the  original  MK  86 
operational  program  was  written  in  assembly  language 
and  hosted  in  a  MK  152  (UNIVAC  1219B)  computer  with 
32k of 18-bit memory. In the early 1970’s, as MK 86
outgrew  the  MK  152  computer,  efforts  began  to  translate 
the  program  into  UYK-7  code.  The  product  that  resulted 
was  very  much  like  the  MK  152  program  with  changes 
made  to  utilize  the  UYK-7  features  and  new  system 
hardware.  The  significant  computer  improvements  of  that 
era included the faster CPU, memory, and an independent
Input/Output Controller. In accordance with U.S. Navy
policy  to  use  standardized  "high  order  languages,"  MK  86 
began  coding  system  improvements  in  CMS-2.  Developed 
in  the  late  1960’s,  CMS-2Y  was  a  "Compiler  Monitor 
System"  for  the  32  bit  instruction  set  architecture 
machines.  Compilations  were  performed  on  the  UYK-7 
itself. 

As  a  language,  CMS-2  has  continued  to  be 
updated,  and  is  now  supported  on  platforms  other  than  the 
UYK-7  system.  MK  86  now  uses  the  Navy’s  standard  set 
of  support  software  tools  called  "Machine  Transferrable 
And  Support  Software  (MTASS)"  hosted  on  a  VAX  to 
perform  life  cycle  software  support  activities. 

To frame the discussion of re-engineering our
product, Figure 3 depicts the structure of the current MK 86
operational program. There is a basic one-second
cycle of events that must be completed to effect
system operation. The one-second window is broken
down into 16 segments ("sixteenths"), each having specific
functions to complete which support the tasks of other
sixteenths. When the same task is performed four times
during each sixteenth, the task is said to be a "64 Hz" task.
64 Hz processing is initiated by the IOC Monitor Clock.
While the major Executive cycle executes at a 16 Hz rate,
there are also a number of tasks which execute at rates of
8 Hz, 4 Hz, 2 Hz, and 1 Hz. These are referred to as "Variable
Sequences" and consist of a number of MK 86 functions
which must complete during an allocated amount of time,
but need not be executed at a 16 Hz rate.
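
The timing structure described above can be sketched as a simple cyclic
schedule. This is an illustration only, not the MK 86 Executive: it
assumes each Variable Sequence runs in the sixteenths that are whole
multiples of its period, whereas the real program may stagger them, and
all names are hypothetical.

```python
SEGMENTS_PER_SECOND = 16   # the one-second cycle is split into sixteenths
SLOTS_PER_SEGMENT = 4      # four executions per sixteenth gives 64 Hz
VARIABLE_RATES_HZ = (8, 4, 2, 1)


def tasks_due(segment):
    """Return the rates (in Hz) of the tasks executed in one sixteenth."""
    due = [64] * SLOTS_PER_SEGMENT   # 64 Hz task runs four times
    due.append(16)                   # major Executive cycle runs once
    for rate in VARIABLE_RATES_HZ:
        period = SEGMENTS_PER_SECOND // rate   # sixteenths between runs
        if segment % period == 0:              # assumed placement
            due.append(rate)
    return due


def run_one_second():
    """Count how many times each rate group executes in a full cycle."""
    counts = {}
    for segment in range(SEGMENTS_PER_SECOND):
        for rate in tasks_due(segment):
            counts[rate] = counts.get(rate, 0) + 1
    return counts


print(run_one_second())
```

Each rate group executes as many times per one-second cycle as its name
implies, which is the property the Executive has to preserve when work
is moved to the EMU Processor.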

Looking forward to operational software
integration with the EMU, the basic program cycle will
not change. The 64 Hz, 16 Hz, etc. processing will
continue.  Our  challenge  is  to  off-load  tasks  that  the 
UYK-7  struggles  with  onto  the  EMU  Processor.  The 
EMU  drawer  will  function  as  a  two  port  UYK-7  memory 
with  the  M68040  processor  providing  high  speed 
execution  and  "mathpac"  capabilities.  The  M68040 
integer  and  floating  point  capabilities  are  extensive.  In 
anticipation,  we  have  begun  recoding  mathematically 
intense  operations  such  as  coordinate  conversion  routines, 
various  predictors,  and  stabilization  subprograms  to  utilize 
the  EMU  processor’s  power.  It  will  be  very  interesting  to 
compare  M68040  performance  with  UYK-7  and  MK  152 
data  where  trigonometric  functions  were  computed 
utilizing  look-up  tables.  For  example,  early  MK  86 
project  notes  indicate  that  a  MK  152-hosted  sine  or  cosine 
function,  with  "14  1/2  bit  accuracy"  could  be 
accomplished  in  as  little  as  400  microseconds!  Recent 
investigations  measure  the  same  basic  MK  86  algorithm, 
executing  in  the  UYK-7,  at  87  microseconds.  Given  the 
EMU  processor’s  integer  and  floating  point  computational 
power,  we  fully  expect  to  reduce  our  math  execution 
times  by  an  order  of  magnitude. 
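
For readers unfamiliar with table-driven trigonometry, the following is
a minimal sketch of the general technique. It is not the MK 86
algorithm: the table size, the use of linear interpolation, and the
reading of "14 1/2 bit accuracy" as a maximum error of 2^-14.5 are all
assumptions made for illustration, and Python stands in for the
original assembly language.

```python
import math

TABLE_SIZE = 128                      # assumed intervals per quadrant
STEP = (math.pi / 2) / TABLE_SIZE     # angle between adjacent entries
SINE_TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE + 1)]


def table_sine(angle):
    """Approximate sin(angle) in radians from a quarter-wave table."""
    angle = math.fmod(angle, 2 * math.pi)
    if angle < 0:
        angle += 2 * math.pi
    quadrant, offset = divmod(angle, math.pi / 2)
    quadrant = int(quadrant) % 4
    if quadrant in (1, 3):            # mirror the second and fourth quadrants
        offset = math.pi / 2 - offset
    position = offset / STEP
    index = min(int(position), TABLE_SIZE - 1)
    fraction = position - index
    low, high = SINE_TABLE[index], SINE_TABLE[index + 1]
    value = low + fraction * (high - low)     # linear interpolation
    return value if quadrant < 2 else -value  # lower half-cycle is negative


def table_cosine(angle):
    """Cosine obtained from the same table by a quarter-cycle shift."""
    return table_sine(angle + math.pi / 2)


# Worst-case error over a sweep of angles, compared with the library sine.
worst_error = max(
    abs(table_sine(i * 0.001) - math.sin(i * 0.001))
    for i in range(-7000, 7000)
)
print(f"worst error {worst_error:.2e}, target {2 ** -14.5:.2e}")
```

A hardware floating point unit removes the need for the table and the
interpolation step altogether, which is where the expected reduction in
math execution time would come from.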

A  depiction  of  the  new  structure  is  shown  in 
Figure  4.  All  standard  processes  will  be  shared  with  the 
processor  to  the  greatest  extent  possible.  This  will  allow 
Variable  Sequence  processes  to  be  performed  earlier  (as 
required).  The  UYK-7  will  have  more  time  available  to 
perform  additional  background  processes.  This  will  allow 
for  improvements  in  on-line  data  reduction  and  analysis, 
for example. Certain rigid processes will remain
unchanged. 

With  regard  to  language,  a  major  factor  in 
selecting  the  EMU  processor  was  the  ability  to  compile 
existing  CMS-2  code  with  a  new  target  in  mind.  The 
MTASS  CMS-2K  compiler  supports  the  M68030  target 
processor  as  utilized  in  the  AN/UYK-43’s  "Time  Critical 
Subfunction  (TCS)".  Therefore,  we  felt  secure  in 
selecting  the  M68040  which  fully  supports  the  repertoires 
of  the  M68030  and  M68882  coprocessor  -  only  faster  and 
more  efficiently.  In  the  development  of  a  computer 
program  which  could  be  executed  by  the  EMU  processor, 
we discovered a problem. Using our MTASS tools, we could
not build a bootable object tape containing
M68040 code that could be loaded into the UYK-7 and
executed in the processor’s upper 128k of memory. That
upper 128k space is where the EMU processor’s high
speed RAM is located. The problem arises mainly because
the UYK-7 and the EMU
processor have two separate memory maps for the same
physical memory space. In these maps, the EMU
processor has read/write access to the UYK-7 memory,
but the UYK-7 cannot access the "non-shared" processor
memory. This is depicted in Figure 5. The UYK-7’s
lower  128k  memory  space  is  represented  by  locations  0 
to  377,777  (octal),  while  the  same  space  represents 
locations  2,000,000  to  4,000,000  (octal)  to  the  EMU 
processor  (its  second  128k  space). 
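
The two views of the shared space can be related by a simple address
translation. The sketch below is an inference from the ranges quoted
above, not a description of the EMU hardware: it assumes the UYK-7
addresses 32-bit words while the EMU processor addresses bytes, so that
131,072 words occupy 2,000,000 (octal) byte addresses. The function
names are hypothetical.

```python
UYK7_SHARED_WORDS = 0o400000    # 128k words: locations 0 to 377,777 octal
EMU_SHARED_BASE = 0o2000000     # start of the same space as seen by the EMU
BYTES_PER_WORD = 4              # assumed: 32-bit words, byte addressing


def uyk7_to_emu(word_address):
    """Translate a UYK-7 word address into an EMU processor byte address."""
    if not 0 <= word_address < UYK7_SHARED_WORDS:
        raise ValueError("address is outside the UYK-7 128k space")
    return EMU_SHARED_BASE + word_address * BYTES_PER_WORD


def emu_to_uyk7(byte_address):
    """Translate an EMU byte address back to a UYK-7 word address."""
    offset = byte_address - EMU_SHARED_BASE
    if not 0 <= offset < UYK7_SHARED_WORDS * BYTES_PER_WORD:
        raise ValueError("address is not visible to the UYK-7")
    return offset // BYTES_PER_WORD


print(oct(uyk7_to_emu(0)), oct(uyk7_to_emu(0o377777)))
```

Any EMU address outside the shared window has no UYK-7 equivalent,
which is the root of the loading problem discussed next.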

Due  to  the  difference  in  memory  mapping,  and 
the  fact  that  the  UYK-7  does  not  have  access  to  the  EMU 
processor’s  upper  memory,  no  programs  can  be  loaded 
directly  into  the  EMU  processor’s  upper  memory  via  the 
UYK-7’s  NDRO  bootstrap.  This  problem  can  be 
overcome  by  loading  the  EMU  processor  code  into 
UYK-7 memory via the bootstrap and "downloading" the
code  from  UYK-7  memory  to  EMU  processor  memory 
via  the  EMU’s  Download  command. 

The download method would seem to be the
answer to the loading problem; however, this method
reveals another problem. In order to execute a program
somewhere in memory, the program must be linked to
that particular memory area so that instruction operands
evaluate to correct absolute addresses for data movement
and branches. On the other hand, when building a
bootable tape using MTASS, the Tape Builder program
(TBL) uses the absolute start address of the program
calculated during link time as the address at which to load
that program. In other words, a program which is
compiled and linked to execute in EMU processor
memory cannot be loaded into UYK-7 memory.

Until  a  more  elegant  solution  is  found,  a 
temporary  "loader"  program  is  being  created  for  the 
processor  program  which  ignores  the  starting  address 
associated  with  the  NTDS  boot  record  and,  instead,  loads 
the  EMU  Processor  program  into  a  predetermined 
location  in  UYK-7  memory.  This  allows  the  EMU 
processor  program  to  be  linked  relative  to  an  area  of 
EMU  processor  memory,  and  loaded  into  UYK-7 
memory  for  downloading. 
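
The two-step load can be sketched as follows. This is a simplified model
for illustration, not the actual loader: memories are modeled as
dictionaries indexed by abstract word addresses, the staging address is
a made-up value, and the `download` function merely stands in for the
EMU's Download command, whose real interface is not described here.

```python
from dataclasses import dataclass
from typing import Dict, List

UYK7_VISIBLE_WORDS = 0o400000   # 128k words the UYK-7 can address
STAGING_ADDRESS = 0o100000      # hypothetical predetermined UYK-7 location


@dataclass
class BootRecord:
    link_address: int           # absolute start address fixed at link time
    words: List[int]            # program image


def standard_load(record: BootRecord, uyk7_memory: Dict[int, int]) -> None:
    """Normal bootstrap behavior: load at the address in the boot record."""
    if record.link_address + len(record.words) > UYK7_VISIBLE_WORDS:
        raise MemoryError("link address is not visible to the UYK-7")
    for offset, word in enumerate(record.words):
        uyk7_memory[record.link_address + offset] = word


def staged_load(record: BootRecord, uyk7_memory: Dict[int, int]) -> int:
    """Temporary loader: ignore the boot-record address, use the staging area."""
    for offset, word in enumerate(record.words):
        uyk7_memory[STAGING_ADDRESS + offset] = word
    return STAGING_ADDRESS


def download(uyk7_memory: Dict[int, int], source: int, length: int,
             emu_memory: Dict[int, int], destination: int) -> None:
    """Stand-in for the Download command: copy staged words to EMU memory."""
    for offset in range(length):
        emu_memory[destination + offset] = uyk7_memory[source + offset]
```

A usage example: a program linked for an address the UYK-7 cannot reach
is rejected by the standard path, but it can be staged and then copied
to the address it was linked for.

```python
record = BootRecord(link_address=0o1000000, words=[11, 22, 33])
uyk7_memory, emu_memory = {}, {}
staged_at = staged_load(record, uyk7_memory)
download(uyk7_memory, staged_at, len(record.words),
         emu_memory, record.link_address)
```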

While the overall implementation of changes in
the MK 86 project will be incremental, the process
governing each upgrade will remain the same over the next 3
years: identify candidate tasks to be off-loaded to the
EMU processor, recode whatever assembly language exists,
recompile under CMS-2K, and recertify
performance. This method supports our existing
maintenance philosophy, requires no additional personnel,
and is supported by our existing standard test procedures.


Lessons  Learned  &  Conclusions 

Perhaps the most difficult part of a design is
accurately defining the interface specifications between the
unit to be designed and the rest of the system. Most of
the requirements of the design were derived from the
technical manuals and specifications in the fabrication
drawing package. When trying to duplicate the exact
functions of a device, measured data defines the interface
requirements accurately. But when increased
performance is desired, minimum and maximum timing
requirements and relationships are required. In more than
three cases during the design, timing relationships defined or
implied in the available data sources were found to be
incorrect, which resulted in design changes during testing.

Another problem was noted when attempting to
change device types. The UYK-7 channel busses were
implemented with open-collector devices. An attempt
was made to use tri-state drivers to perform the same
function. Early testing showed that everything would
work properly with tri-state drivers, but when the full
system was tested, intermittent problems occurred due to
decreased noise margins introduced by the tri-state
drivers. This forced a design change back to open-collector
drivers, which are not as readily available for certain
functional device types.

These  examples  emphasize  the  advantage  of 
performing  early  confidence  testing  to  help  define  design 
requirements  and  parameters. 

Finally,  this  project  demonstrates  the  advantage 
of  providing  technology  insertions  for  existing  systems. 
Benefits  are  derived  in  system  capabilities  and 
performance as well as substantial cost advantages.


ABSTRACT

This paper describes the reengineering of the
Navy's LAMPS MK III to provide a new high data
rate (HDR) ship-to-ship LOS communication capability.
The application of present LAMPS MK III, with added
capability, permits ship-to-ship LOS data
communication rates of nearly three times the
T1 rate of 1.544 Mbps. A cooperative effort is
underway to demonstrate this capability with the
support and resources of CINCLANTFLT. This
demonstration will consist of a HDR LAMPS MK III
extension of the existing USS Mount Whitney Satcom
Video Teleconferencing (VTC) to an HDR modified
LAMPS MK III ship. Since the VTC extension will
only operate at a 386 Kbps rate, other multimedia
loading of the link will be used to more fully load the
communication link. This VTC demonstration using
LAMPS MK III will be discussed in this paper.


1. INTRODUCTION

The LAMPS MK III reengineering challenge is largely
related to the achievement of this new HDR
communication capability without making system
changes that could impact the LAMPS MK III primary
mission of ASW and ASST (anti-ship surveillance and
targeting). This system's design features have been
exploited in order to satisfy the program's HDR
objective of T-1, or 1.544 Mbps. The
use of LAMPS MK III is the most viable and cost
effective way for the Navy to provide a large number
of combatant ships with LOS HDR ship-to-ship
communications. Navy R&D program plans for FY 95
call for creating an even higher data rate
communication system. This new system only uses the
LAMPS MK III waveguide and high gain antenna
system. These LAMPS MK III elements are coupled,
when required, to a new wideband receive-and-transmit
system to achieve a ship-to-ship or ship-to-shore
communication data rate of T-3, or 44.7 Mbps.
This capability would be particularly useful to satisfy
the shore-to-ship theater extension (lENEI)
requirements of Global Grid. Figure 1.1 provides a
photograph of the present shipboard LAMPS MK III
system (SRQ-4) and Figure 1.2 shows the LAMPS MK
III (SH-60B) helicopter, with its LAMPS MK III system
on-board, just after its takeoff from FF 8 of the Fast
Frigate Class. ASW and ASST data that the LAMPS
MK III helicopter collects while on station, up to 100
hkrn from its mother ship, is transmitted at a HDR back
to the ship. The Navy has in excess of 80 ships with
the LAMPS MK III. The LAMPS MK III antenna
installation is shown high up on the ship's mast of a
DD 963 in figure 1.3.

Mr. Rahiliy's work is sponsored under 62 block funding by the
Office of Naval Research, contract ND0014-94-WX-3S038AD


2.  DISCUSSION 

In the early phases of the R&D investigation,
technical issues were examined and conclusions
reached that led to the belief that LAMPS MK III was
the optimum way to achieve a HDR ship-to-ship
communication capability. For example, the scope of
the investigation required the consideration of the entire
SHF band. However, as a result of a review of the
allowable electromagnetic frequency regions in the SHF
band, where nonsatellite ship-to-ship communications
is permitted by international agreement, it was found
that only the spectral region from 4.4 to 5.0 GHz was
allocated for this purpose. Therefore, this is the
operating frequency band used by LAMPS MK III to
establish a communication link between a ship and a
helicopter.

Another issue relates to the assessment of the
electromagnetic compatibility of any newly created
communication system with the existing LAMPS
MK III shipboard transmitter and receiver operating
environment. For example, any newly created system
would be operating in the same 4.4 to 5.0 GHz
frequency band as the LAMPS MK III system and
must not interfere with the high priority communication
linkage between the ship and the SH-60B helicopter.
The approach taken in this new ship-to-ship
communication concept is to not create a new system
that might interfere with LAMPS MK III but rather to
use the LAMPS MK III system itself to provide the
ship-to-ship communications when it is not performing
its operational role with the LAMPS MK III helicopter.

Figure 1.1. AN/SRQ-4 radio terminal set.

Figure 1.2. LAMPS III SH-60B helicopter after takeoff from a LAMPS III equipped FF 8.

Another advantage of using the existing LAMPS
MK III system to achieve a ship-to-ship HDR
capability is the fact that the LAMPS MK III
high gain antenna system, because of its high
operational priority, is located very high up on the
ship's mast, thus maximizing the LOS range. Real
estate is very scarce at these heights and a new system
would find it difficult to get a location as high as
LAMPS MK III currently has.

Also, by using the LAMPS antenna system, there
are no major developmental or procurement costs to
obtain a high gain, stabilized antenna that provides
monopulse azimuth tracking, has passed mil-spec shock
and vibration testing, and has demonstrated highly
successful operational reliability while providing
mission support in conjunction with the LAMPS
MK III helicopter system.

It is the system goal, in this proposed use of the
LAMPS MK III shipboard assets, to not only use the
above-deck assets but also to use to a maximum degree
all the below-deck subsystems. These subsystems
include the frequency synthesizers, modulation and
demodulation subsystems, transmit and receive
subsystems, mux and demultiplex subsystems, and
KG45 crypto systems. However, variations from the
present LAMPS MK III must carefully consider the
requirement to be able to rapidly reconfigure the
system from a posture that supports ship-to-ship
communication to one that can immediately support the
LAMPS MK III mission. The LAMPS MK III mission
must always be the primary operational mode, and any
design changes must always be evaluated from this
perspective.


3. DESCRIPTION OF LAMPS MK III HDR
COMMUNICATION CONCEPTS


In order to most effectively discuss this new
concept, it is best to first examine how LAMPS MK III
communications presently operates. Figure 3.1 shows
the basic facets of LAMPS MK III ship-helicopter
communications. The shipboard LAMPS transmit-and-
receive equipment is below-deck and is connected by
waveguide through the waveguide switch shown in
figure 3.1 to either the high gain or omni antennas,
mounted high on the mast. This figure shows the high
gain antenna in use. This occurs when the helicopter is
a significant distance from the ship. The uplink requires
less than 10% of the bandwidth required by the
helicopter-to-ship downlink. The term wideband (WB)
is used for the downlink and narrowband (NB) is used
for the uplink.

The WB downlink includes the NB response data,
since it is multiplexed in with the wideband downlink
data stream. Thus, a full duplex communication is
achieved relative to the narrowband data
communication. The wideband link is simplex in
nature, since there has been no need for WB
transmission from the LAMPS MK III ship until now.


3.1 LAMPS MK III COMMUNICATION
CONCEPT BASELINE SYSTEM WITH
MAXIMUM CAPABILITY

The baseline LAMPS MK III HDR communication
concept consists of using the shipboard LAMPS MK III
(SRQ-4) and integrating these assets together with the
LAMPS MK III transmit, receive and mux/demux
assets that are used on the SH-60B helicopter
(ARQ-44). Figure 3.2 illustrates this concept. As may
be seen in this sketch, a new dual rotary waveguide
switch now replaces the present single action switch.
With the waveguide switch position shown, the SRQ-4
receive-and-transmit system is connected to the high
gain antenna, while the ARQ-44 is connected to the
omni antenna. This is the ship A configuration. Ship B
is at the other end of the link and its connectivity is
just reversed. Therefore, it can be seen from figure 3.2
that the high gain antennas are transmitting on the
narrowband (NB) data links and the zero dB gain omni
antennas are transmitting the wideband (WB) data
links. It should also be noted that by rotating the SMA
switches the NB or WB operation can be conducted in
either the LB (lower band) or the UB (upper band).
This is required in order to achieve sufficient frequency
separation to afford the needed isolation between the
transmitted energy from a given antenna and the low
level signals that are being simultaneously received by
the other. This signal isolation is presently obtained
using two broadband RF bandpass filters in the lower
and upper frequency bands in each system. This
arrangement provides a diplexer with as much as 90 dB
of isolation between the receive and transmit signals
when the ship is using either the high gain antenna or
the omni antenna. From Figure 3.2 we can see that
the ARQ-44 is transmitting WB in the LB filter and the
SRQ-4 is receiving WB in the UB to achieve the
needed transmit to receive isolation. Therefore, on
each of the platforms the separation between the
diplexer bandpass filters, located in the UB and LB, is
the same and is sufficiently great that transmission and
reception can occur simultaneously without causing
corruption of the quality of the received data.

FIGURE 3.2

For this test system the UB bandpass RF filter used
for the WB transmit and receive functions and the LB RF
filter used for NB transmit and receive functions have
not been tailored for this test application. It has been
found by testing the LAMPS MK III diplexer filters
that the bandpass characteristics of the UB filter were
about an order of magnitude larger than was expected.
Consequently, it will be possible to allow operation of
the WB transmit through it in about seven channel
locations. For this reason, it is estimated that full
duplex operation could occur in seven channels. In the
foregoing example we have proceeded, using the
present LAMPS MK III communication architecture, to
create two simultaneous full duplex loops: one for
narrowband and the other for wideband communication.
Whether the present 90 dB isolation can be achieved or
not remains to be determined. If the isolation level is
unsatisfactory for simultaneous full duplex WB and NB
operation, then a nonsimultaneous WB or NB full duplex
communication mode could solve the resulting
interference problem. To achieve the WB/NB capability
shown in figure 3.2, the following would be added to
the basic HDR LAMPS MK III capability in order to
equip two ships with a maximum capability:

1. (2) dual waveguide switches

2. (2) SMA type switches at input to the R/T
   subsystems.

3. (2) mods to Voltage Tuneable Signal Sources
   (VTSS) to permit shifting frequency
   references in accordance with the new WB
   UB and NB LB operation while still being
   able to meet LAMPS transmit and receive
   bands and channelization in the normal
   mode (1).


(1) Note: The mods of this new VTSS subsystem
have now been built and tested.


4. SHIP TO SHIP HDR LAMPS MK III
COMMUNICATIONS DEMONSTRATION
SYSTEM

Operational demonstrations that might be
associated with the evolution of this new ship-to-ship
high data rate communication capability can take many
forms. The first of these demonstrations begins with
tests in the laboratory. At the next level we will make
a bridge between the laboratory and the world of the
Navy combatant ship that already has a LAMPS
MK III system on-board. The specifics of the areas of
investigation, in each of the above categories, are
contained below.


4.1 NAVY LABORATORY TESTING

The work in this area was begun last year because of
the support provided by NESEA, St. Inigoes, MD.
Although the documentation supporting the LAMPS
system is quite extensive, there are still many areas
where documentation deficiencies were noted. Where
information deficiencies occurred, detailed design
information and data on subsystem characteristics were
derived by NESEA. In these situations NESEA
conducted various measurements on the LAMPS
equipment to reveal this needed design information.
Because of security classification issues, some of these
results cannot be presented.

The laboratory testing in preparation for the at-sea
demonstration has been conducted at NAVSEA. To
date, testing has shown that the new frequency
references for upper band WB and lower band NB
operation are functioning properly. A considerable
effort has been devoted to the data interface for the test
system. The LAMPS MK III WB downlink has
evolved along telemetry lines, and the transmitted data
words are commutated into specific data-framing
locations. Therefore, along with the WB data which
may be in either the LAMPS Radar or ASW channels,
there are also discrete data frame locations that are
used concurrently to carry 26-bit blocks of data plus
6 bits of EDAC. These slots are used to carry computer
data, voice, and DME data. The computer channel is
used for command and control information between the
ship and the helicopter. The other two framing slots are
available for the data block transmission that results in
an independent 56 Kbps digital voice link and a 26-bit
data block that is used for distance measuring
equipment (DME) between the ship and the helicopter.
The HDR data interface challenge develops from the
fact that the LAMPS MK III data flow is not a
continuous flow of one type of data as occurs in the
normal commercial T-1 communications world. Data
buffering using a PC-based FIFO (first-in, first-out)
buffering system and a special serial I/O card plus
strict timing controls have been developed by the Navy
to achieve this interface compatibility. This required
data interface has been developed jointly by NRaD and
NESEA.

The next phase is the laboratory testing of the
system which includes all the equipment,
Timeplexer multiplexers, data interfaces, HDR SRQ-4
and HDR ARQ-44 terminals, including the Cesium and
Rubidium time standards and cryptos that would
represent video data to RF interconnection of the two
HDR LAMPS equipped ships. This would duplicate full
duplex VTC at-sea testing using the HDR LAMPS that
is to take place before the end of February 1994.


4.2 AT-SEA VTC DEMONSTRATION TESTING
OF THE HDR LAMPS MK III
COMMUNICATIONS SYSTEM

Figure 4.1 presents a pictorial representation of the
at-sea VTC testing with the USS Mount Whitney that
is planned for April/May 1994. Prior to this test
there will be an at-sea test using only two LAMPS
MK III ships operating on a point-to-point basis. In this
early at-sea test there will be no requirement to
interface with the Satcom VTC. The April/May testing
will require the Satcom interface that will permit video
teleconferencing between the CINCLANTFLT
Headquarters and the HDR LAMPS MK III ship. The
USS Mount Whitney does not currently have a LAMPS
MK III installation, so it is being equipped with a
shelter to house the HDR LAMPS MK III capability
and will use an omni antenna that is telescopically
mounted to the roof of the shelter.

Ship-to-ship VTC tests will be conducted at sea at
various ranges and shipboard EMI (electromagnetic
interference) conditions. This demonstration will reveal
the level of electromagnetic interference that will be
experienced by the HDR LAMPS MK III receive
system in the WB mode. Also, the level of RF
interference with other ship systems that will be created
by the LAMPS transmitter emitting a WB signal, using
either the omni or the high gain directional antenna at
SHF, will be examined.


4.3 MINIMUM COST HDR LAMPS MK III
VTC DEMONSTRATION SYSTEM

Figure 4.2 presents the HDR LAMPS MK III
minimum-cost system to be used in at-sea VTC tests.
This system will allow full duplex WB data
communications between two ships to support the
VTC. As in the HDR configurations discussed, it is
necessary to integrate an ARQ-44 with the shipboard
SRQ-4 to achieve this WB capability. Figure 4.2 shows
that, since this is a test, simplification of the system is
possible. It can be noted that ship A and ship B have
particular connections of the bandpass filter to the WB
receiving and WB transmitting subsystems. Both of
these connections are manually made via SMA lines
within the SRQ-4 and the ARQ-44. The restriction to
WB operation removes the need for the dual waveguide
switch shown in figure 3.2. When both ship A and
ship B are LAMPS MK III ships, then either one could
be ship A or ship B, since they both have the same
antenna systems. In this drawing it can be noted that in
the ship B configuration the ARQ-44 would be
configured as it normally is, in that it transmits WB on
the LB/BPF. On the SRQ-4 side, we can see that the
shipboard system, which normally receives WB on the
LB/BPF, is now to receive it from ship A in the
UB/BPF. When the Satcom VTC testing occurs, the
shelter that contains the HDR LAMPS MK III
equipment will be represented by the configuration
shown as ship A since it will use the 0 dB omni
antenna. The data interface developed for this system
is as shown and for a number of reasons will operate
only at twice the T-1 rate, or about 3 Mbps. Other than
the SMA switching shown and the VTSS changes, the
rest of the ARQ-44 and SRQ-4 radio terminal
equipment will remain the same as it is today.


5.0 PREDICTED COMMUNICATION
PERFORMANCE

Using a Navy computer program (PC) called
SLAM that allows a parametric examination of the
communication performance of any particular system,
including the effects of multipath, propagation
diffraction, and ducting, the plot shown in Figure 5.1
was developed. The actual characteristics of the present
LAMPS system were utilized including the system
losses that affected both the transmitted signal as well
as the received signal. These losses were provided in
the engineering documentation in the Navy files on the
LAMPS MK III. The term on the ordinate in Figure 5.1
is called margin and stands for the signal power level,
in dB, greater than that level which will just allow
achieving the desired data bit error rate (BER).

FIGURE 4.2. LAMPS MK III HDR TEST CONFIGURATION FOR AT-SEA
VIDEO TELECONFERENCING TEST

FIGURE 5.1. PREDICTED SHIP TO SHIP COMMUNICATIONS LINK
PERFORMANCE USING HDR LAMPS MK III DURING
AT-SEA VIDEO TELE-CONFERENCE TEST

For the link computations that resulted in
Figure 5.1, it was assumed that the antenna on each
ship was at 100 ft above sea level, which results in a
maximum line-of-sight (LOS) range of about 20 nmi. When
sufficient excess power is available, range values will
be in excess of the LOS limit, out to a diffraction loss
limit. Figure 5.1 shows the expected performance of
the LAMPS communication link between the high gain
LAMPS antenna and the LAMPS "0" dB antenna as a
function of range, assuming a communication data rate
of 4 Mbps with an Eb/No = 12 dB. The sharp downward
spikes in Figure 5.1 are due to multipath signal
degradation effects. The worst of these can be seen to
be about 5 dB above the level which would create
signal degradation. This is regarded as the multipath
safety margin that exists in this link. The received
signal power is sufficiently strong to allow
communication out to ranges of about 25 nmi, including
the link diffraction losses.
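
The line-of-sight figure quoted above can be checked with a short
calculation. The sketch below is only an illustration and is not taken
from SLAM: it assumes a smooth spherical earth of mean radius 6371 km,
and the function names and the effective-earth-radius factor k are
introduced here for the example (k = 1 ignores atmospheric refraction).

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean earth radius (assumed value)
FT_TO_M = 0.3048
M_PER_NMI = 1852.0


def horizon_nmi(height_ft: float, k: float = 1.0) -> float:
    """Distance to the horizon, in nmi, for an antenna height in feet.

    k is the effective-earth-radius factor: 1.0 is the purely geometric
    horizon; larger values model refraction bending the ray path.
    """
    height_m = height_ft * FT_TO_M
    return math.sqrt(2.0 * k * EARTH_RADIUS_M * height_m) / M_PER_NMI


def los_range_nmi(h1_ft: float, h2_ft: float, k: float = 1.0) -> float:
    """Maximum line-of-sight range between two antennas (sum of horizons)."""
    return horizon_nmi(h1_ft, k) + horizon_nmi(h2_ft, k)


if __name__ == "__main__":
    # Two shipboard antennas at 100 ft, as assumed in the text.
    print(f"{los_range_nmi(100, 100):.1f} nmi")
```

With both antennas at 100 ft and k = 1, the result is a little over
21 nmi, which is consistent with the "about 20 nmi" stated in the text.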


6.0 CONCLUSIONS

The US Navy has a valuable resource in the
LAMPS MK III system, and this has been demonstrated
in its remarkable operational performance. It has been
operational in the fleet for at least 10 years. It is a well
engineered system that in many ways is still very
sophisticated. Its excellent performance results from the
soundness of its engineering. It is an example of what
is possible in operational fleet electronics. For these
reasons it represents a very sound foundation upon
which to create another remarkable level of LAMPS
MK III usefulness to the Navy. It presently has an
operational role which it performs efficiently and
effectively. Therefore, variations from its present design
have to be very carefully evaluated to be sure that no
reduction in present operational capability will occur if
any design changes are implemented.

This paper has touched on some specific
communication and operational scenarios. Others being
examined, but not discussed, include networking of an
Ethernet type between HDR LAMPS MK III ships
that uses the high gain to omni link between ships to
achieve a 4 Mbps network trunk. In another version,
networking is achieved only between the HDR LAMPS
MK III omni antennas at a NB rate. What the HDR
mode offers to the Navy is a communication trunk that
can serve many forms of data communication
simultaneously. It is, in fact, ideally suited to serving
the new world of multimedia, including large file
transfers, digital interactive video, high resolution
image transmission, and digital voice. There are also a
variety of possible operational military configurations
that appear to have very desirable features. At this
juncture the technical investigations and operational
testing will continue so that greater insight will be
developed as to its most desirable and achievable
operational capability.

Abstract

The results and lessons learned
from a STARS (Software Technology for
Adaptable, Reliable Systems) sponsored
process technology transfer
demonstration are presented in this paper.
The  Armament,  Munitions  and  Chemical 
Command  (AMCCOM)  Life  Cycle 
Software  Engineering  Center  (LCSEC)  at 
Picatinny  Arsenal  was  selected  to 
demonstrate  that  Cleanroom  Software 
Engineering  (CSE)  and  Process-Guided 
Project  Management  (PGPM)  could  be 
successfully  applied  in  a  typical  DoD 
Software  Support  Activity  (SSA). 
Results indicate that:

• CSE practices and PGPM can be
successfully transferred to a typical
DoD SSA,

• engineering staff productivity and
quality were increased while
simultaneously increasing job
satisfaction, and

• a return on investment of at least 5.9:1
has been realized on the first project
to which CSE and PGPM techniques
were applied.


Technology  Transfer  Goal 

The  goal  for  the  technology 
transfer  effort  for  the  AMCCOM  LCSEC 
at  Picatinny  Arsenal  was  to  conduct  a 
demonstration  of  CSE  practices  and 
process-guided  project  management  at  a 
typical  DoD  SSA. 


AMCCOM  LCSEC  was  selected 
in  response  to  their  expressed  interest  in 
improving  the  process  by  which  they 
maintain  software  in  general  and, 
specifically,  in  using  the  CSE 
technology. Additionally, because
AMCCOM LCSEC is a typical DoD SSA,
the demonstration was seen as a way to
improve the means by which the
government spends the largest portion
of its software money, i.e., software
maintenance and re-engineering (as
opposed to new software development).
The demonstration was to be facilitated
by IBM and SET (Software Engineering
Technology, Inc.).

The  LCSEC  at  Picatinny  Arsenal 
is  a  representative  DoD  Software  Support 
Center  that  wants  to  apply  a  more  formal 
approach  to  software  support.  The 
current  state  of  software  re-engineering  at 
the  AMCCOM  LCSEC  varies  from 
project  to  project  but  the  majority  have 
not  achieved  the  desired  level  of 
productivity and quality. A major goal of
AMCCOM LCSEC is to achieve a
Software Engineering Institute Capability
Maturity Model (SEI CMM) Level 3
rating by adopting an evolutionary
process improvement approach to
software re-engineering. Currently,
AMCCOM LCSEC is receiving support
from IBM and SET, under STARS Task
IA02, in applying the Cleanroom
approach to the re-engineering of the
Mortar Ballistic Computer (MBC). Initial
results have been successful in terms of
the projects employing the Cleanroom
engineering practices and adopting a
process-driven team organization.

Organization Overview

DoD  Software  Support  Centers 
(SSCs)  provide  important  opportunities 
to  demonstrate  STARS  efforts  to  improve 
software quality and productivity. SSC
activities represent a major portion of the
DoD software budget, and the proportion
is expected to increase during the
next decade. This will occur as the many
systems in the DoD development pipeline
are turned over to SSCs for support. It is
likely that, as fewer new systems come
into the inventory, DoD managers will
attempt to extend the useful life of old
systems through software enhancements
and re-engineering.

The AMCCOM LCSEC provides
a number of services, including software
acquisition support to program managers,
computer resource life cycle management
plans, pre-planning for software support,
management of contracted post-deployment
software support efforts, software
configuration management, and design
and implementation of software changes. The
types of battlefield automated systems
supported include air defense, cannon
and tank gun systems, smart mines and
munitions, ballistics computers, gunnery
simulators, trainers, and nuclear,
biological, and chemical detection systems.

The MBC project was a
re-engineering of the current system used by
the US Army to aim mortars for combat
support. The existing MBC was
implemented in DTL (Display Terminal
Language) and Z-80 Assembler and is not
easily upgraded for new requirements.
The re-engineered system was
implemented in Ada and as a result can be
moved to new, updated hardware
platforms.

The STARS program is a DoD
research and development effort
funded under ARPA (Advanced Research
Projects Agency). The main thrust of this
effort is that software development is
process-driven, domain-specific,
reuse-based, and supported by an integrated
software engineering environment. This
concept is called Megaprogramming, and
the STARS program is currently engaged
in several demonstration projects of the
technologies developed earlier in the
program. The Picatinny MBC effort is the
first demonstration project to use STARS
concepts. IBM is one of the prime
contractors for this effort and SET is a
principal subcontractor for the
IBM/STARS effort.

The  desire  for  process-driven 
technology  was  the  result  of  a  Software 
Process  Assessment  (SPA)  conducted  by 
a  team  of  representatives  from  the  AMC 
LCSECs  with  coaching  from  the 
Software  Engineering  Institute  (SEI). 
AMCCOM LCSEC management has
developed a close relationship with the
SEI because they desired help with
identifying areas to achieve the desired
level of productivity and quality. Review
of the SPA findings led AMCCOM
LCSEC management to realize that the
software engineering process was not
under intellectual control. Each new
software  project,  whether  performed  by 
contractors  or  civil  servants,  was  treated 
largely  as  new  activity  that  did  not 
necessarily  draw  on  prior  experience  for 
process  improvement.  The  only  factor 
that  perpetuated  experience  was  people, 
be  it  government  or  contractor,  who 
participated  in  the  same  projects  time  after 
time.  Documentation  received  by 
Picatinny,  when  they  were  given  systems 
to  maintain,  was  poor  or  not  up  to  date 
and  no  defined  process  existed  for 
maintaining  continual  project  control.  In 
other  words,  the  state-of-the-practice 
consisted  of  traditional  software 
engineering  practices  that  are  ad-hoc  in 
nature,  as  opposed  to  a  disciplined, 
defined  software  engineering  process. 
These  realizations  and  the  results  from  the 
SPA  were  the  basis  for  their  move  to 
enhance  their  software  engineering 
capabilities. 

Typical DoD SSA organizations
have immature processes and are subject
to morale problems among software
engineers due to the combination of an
undefined manner of doing work and
a lack of task-oriented scheduling.
The  software  engineers  at  the  AMCCOM 
LCSEC  did  their  work  well  because  of 
individual  skills,  but  often  seemed  to  be 
stuck  in  the  same  "groove,"  where  the 
same  situations,  in  terms  of  schedule, 
would  arise  year  after  year.  A  general 
lack  of  enthusiasm  pervaded  our  initial 
discussions  with  project  teams. 

Despite  these  difficulties, 
however,  the  customers  (various  users 
within  the  US  Army)  indicate  that  they 
are  basically  content  with  the  quality  of 
the  products.  Not  many  field  reports  of 
failures  are  submitted  by  their  customers, 
due  to  extensive,  pre-release  user  testing. 
Unfortunately,  evidence  suggests  that  this 
may  also  result  from  the  absence  of 
formal  failure  observation  and  reporting 
mechanisms,  making  the  field  quality  of 
AMCCOM  LCSEC  developed  products 
difficult  to  ascertain. 

AMCCOM LCSEC management
recognized the problems with their state
of the practice and took the initiative to
identify Cleanroom Software
Engineering (CSE) and Process-Guided
Project Management (PGPM) as the
mechanisms with which to facilitate the
desired cultural, technical, and process
changes.


Cleanroom  Software 
Engineering  (CSE) 

CSE was chosen as the
process-driven technology because it addresses
the deficiencies identified during the
LCSEC Software Process Assessment
(SPA). CSE's management and
development team approach was
consistent with quality management
philosophy, e.g., workforce
empowerment, process focus, and
quantitative orientation. It provides for
the transition of process technology to the
project staff and integrates several proven
software engineering practices into one
methodology. LCSEC management
anticipated productivity gains and morale
enhancement from the introduction of the
technology.

CSE consists of a body of
practical and theoretically sound
engineering principles applied to the
activity of software engineering.
Cleanroom includes a thorough
specification phase, resulting in a six-part
specification that contains a precise, black
box description of the software part of a
system. Software development proceeds
from the black box specification via a
step-wise refinement procedure using
box-structured design concepts. This
process focuses on defect prevention,
effectively eliminating costly error
removal phases (i.e., debugging), and
produces verifiably correct software
parts. Development of software proceeds
in parallel with a usage specification
of the software. This usage profile
becomes the basis for a statistical test
of the software, resulting in a scientific
certification of the quality of the software
part of the system.
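
The idea of testing from a usage profile can be sketched in a few lines.
The example below is a minimal illustration, not the certification
procedure used on the MBC project: the usage states, the transition
probabilities, and the names USAGE_PROFILE, sample_test_case, and
estimate_reliability are all hypothetical, and the reliability figure is
simply the fraction of randomly drawn usage sessions that run without
failure.

```python
import random

# Hypothetical usage profile: for each usage state, the possible next
# stimuli and their probabilities. "EXIT" ends a usage session.
USAGE_PROFILE = {
    "START":        [("enter_target", 1.0)],
    "enter_target": [("compute", 0.8), ("enter_target", 0.2)],
    "compute":      [("enter_target", 0.6), ("EXIT", 0.4)],
}


def sample_test_case(rng, max_len=50):
    """Draw one stimulus sequence by a random walk over the usage profile."""
    state, sequence = "START", []
    while len(sequence) < max_len:
        choices, weights = zip(*USAGE_PROFILE[state])
        state = rng.choices(choices, weights=weights, k=1)[0]
        if state == "EXIT":
            break
        sequence.append(state)
    return sequence


def estimate_reliability(run_case, n_cases, seed=0):
    """Fraction of sampled usage sessions for which run_case returns True."""
    rng = random.Random(seed)
    passed = sum(1 for _ in range(n_cases) if run_case(sample_test_case(rng)))
    return passed / n_cases
```

Here run_case stands for whatever harness executes a stimulus sequence
against the software and reports success or failure. Because the test
cases are drawn according to expected use, the pass rate speaks to
reliability as users would experience it rather than to code coverage.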

A  quick  high  level  comparison 
between  the  typical  development  and  CSE 
philosophy  of  software  development  is 
summarized  in  Table  I.  The  typical 
development  environment  can  be 
characterized  by  craft  based  techniques 
which  are  highly  dependent  upon  the 
skills  of  the  individuals  involved  whereas 
CSE  is  an  engineering  discipline  with 
associated  rigor  and  formality. 

Table I: Comparison between Typical Development and CSE

Characteristic           Typical Development          CSE

Programs regarded as     Lines of Instructions        Correct rule for a function.

Specification focus      Incomplete description of    Complete, precise
                         external behavior and        description of external
                         internal design details.     behavior; design details
                                                      left for development.

Specification to code    Informal, debugging to       Stepwise refinement and
transformation process   verify code.                 verification using Box
                                                      Structures.

Failures are             Expected and accepted.       Unacceptable.

Testing strategy         Futile attempt for coverage  Random sample based on
                         and little insight on field  usage model that predicts
                         reliability.                 field reliability.
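
The "specification to code transformation" row of Table I can be
illustrated with a toy box-structure refinement. In the sketch below, a
black box defines the response purely in terms of the stimulus history,
and a state box produces the same responses from the current stimulus
plus stored state. The running-total behavior and the names black_box,
StateBox, and check_refinement are hypothetical, and checking sample
histories is only a stand-in for the formal verification that Cleanroom
calls for.

```python
def black_box(history):
    """Black box: the response is a function of the stimulus history only."""
    return sum(history)


class StateBox:
    """State box: the response depends on the stimulus and stored state."""

    def __init__(self):
        self.total = 0  # state that summarizes the stimulus history

    def respond(self, stimulus):
        self.total += stimulus
        return self.total


def check_refinement(histories):
    """Compare state box and black box responses on every history prefix."""
    for history in histories:
        box = StateBox()
        for i, stimulus in enumerate(history):
            if box.respond(stimulus) != black_box(history[: i + 1]):
                return False
    return True
```

Each refinement step is small enough that its agreement with the level
above can be argued directly, which is the sense in which the table
contrasts stepwise verification with debugging.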

Transfer of CSE Technology: The
transfer of CSE technology was achieved
through  formal,  classroom-style  training 
courses  and  follow-on  coaching  of 
demonstration  team  members.  The 
courses  involved  instruction  on  the 
underlying  specification,  development, 
and  certification  methods  of  CSE  and 
included  in-class  workshops  so  that 
students  gained  experience  applying  the 
technology.  As  often  as  possible, 
workshops  were  supplemented  with 
examples  extracted  from  the  MBC 
project.  Training  provided  the 
introduction  to  and  initial  experience  with 
the  tools  that  would  help  enhance 
individual  and  team  performance. 

Project  support  was  given  to  the 
team  members  through  repeated  on-site 
coaching  visits  by  CSE  experts  from  IBM 
and  SET.  This  activity  helped  to  solidify 
the  new  ideas  as  team  members  saw  how 
the  techniques  were  applied  to  their 
specific  problems. 

The  major  intent  of  the  training 
and  coaching  was  to  establish  the  human 
behavioral  changes  necessary  to  develop 
better software. Implementing CSE is an
intellectually challenging process that
instills specific values into its
participants.  For  example,  the  focus  on 
product  quality,  a  major  Cleanroom 
theme,  instills  a  "get  it  right  the  first  time" 
attitude  into  the  members  of  CSE  teams. 
As  successes  were  made  and  milestones 
conquered,  the  CSE  teams  reported 
significant  improvements  in  job 
satisfaction,  team  spirit,  and  the  desire  to 
continue  quality  improvements.  A 
significant  focus  of  the  coaching  effort 
was  to  positively  reinforce  each  project 
success  in  order  to  create  a  stronger 
identity  with  the  project. 

Such behavioral changes within a
project are strengthened by active
participation from all levels of the
organizational hierarchy, from
contributing technical leads to engineering
management. The initial plan was for the
project staffs to work closely as teams,
rather than as individuals. Additionally,
the intention was for the staffs to be
motivated and excited about what they
were doing; that is, to have a strong identity
with the process and project. Thus,
coaching contained a "cheerleading"
aspect, designed to create a healthy
Cleanroom environment.

Reinforcement of CSE was
provided through the availability of a
six-volume set of process manuals to the
demonstration  groups.  These  process 
manuals  were  an  integral  part  of  the 
training  program  and  were  discussed  in 
detail,  both  during  the  formal  training 
sessions  and  off-line  as  a  part  of  the 
follow-on  coaching  activities.  Their 
purpose  was  to  augment  the  training  by 
providing  reference  information  to 
AMCCOM  LCSEC  engineers  using 
Cleanroom  concepts.  They  serve  as  a 
single  reference  source  for  resolving 
questions  about  specific  issues 
concerning  process  adherence.  The 
process  manuals  are  organized  as 
follows: 

Volume  1:  Cleanroom  Engineering 
Process  Introduction  and 
Overview 

Volume  2:  Organization  and  Project 
Formation  in  the 
Cleanroom  Environment 

Volume  3:  Project  Execution  in  the 
Cleanroom  Environment 

Volume  4:  Specification  Team 
Practices 

Volume  5:  Development  Team 
Practices 

Volume  6:  Certification  Team 
Practices 

The  division  of  the  volumes  represents  a 
separation  of  concerns  for  the  various 
project  stakeholders. 

Process-Guided  Project 
Management  (PGPM) 

CSE  is  a  formal  process  that 
clearly  defines  the  tasks  necessary  for  the 
engineering  effort  to  progress,  the 
completion  conditions  for  each  task,  and 
the  control  flow  that  dictates  the  order  of 
work  on  each  task.  Process-guided 
project  management  entails  the  use  of  a 
clearly  defined  process  as  the  approach  to 
be  used  to  complete  the  particular  project. 
The intent with process-guided project
management is to give engineers a clear
and understandable road map which they
can follow and by which they may track
progress towards project completion.

Transfer of PGPM Methods:
Awareness  of  software  process  is  a  key 
issue  in  successfully  transferring 
technology  to  an  organization  and  to  an 
organization's  long  term  success  with 
applying  CSE.  The  project  staffs  at 
AMCCOM  LCSEC  received  an 
introduction  to  process  definition  and 
process-guided  engineering  in  the  context 
of  CSE.  Coaching  also  reinforced  the 
importance  of  following  the  defined 
process  and  using  the  process  definition, 
which  defines  the  possible  project 
alternatives,  to  support  the  selection  of 
correct  project  choices. 

In  addition  to  training  and 
coaching,  the  engineering  handbooks 
provide  a  key  reinforcement  of  the 
concepts  of  process-guided  engineering. 
Each volume defines the tasks and the
control  flow  between  the  tasks  necessary 
to  conduct  the  specific  process  which  is 
the  focus  of  the  manual.  Engineering 
processes  are  defined  as  formal  control- 
flow  procedures  with  specific  completion 
conditions.  Collections  of  engineering 
processes  also  have  the  same  level  of 
formalized  control  flow  and  completion 
conditions.  Thus,  each  engineer, 
manager  or  other  staff  member  has  well 
defined  roles  and  tasks  that  exist  as  a  part 
of  a  larger  software  process. 
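
A process defined as tasks with completion conditions and control flow
lends itself to a simple data model. The sketch below is a hypothetical
illustration and does not reproduce the handbooks: the Task structure,
the three task names, and the work-product flags in the state dictionary
are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    is_complete: Callable[[Dict[str, bool]], bool]  # completion condition
    successors: List[str] = field(default_factory=list)  # control flow


def ready_tasks(process: Dict[str, Task], state: Dict[str, bool]) -> List[str]:
    """Incomplete tasks whose predecessors have all met their conditions."""
    preds: Dict[str, List[str]] = {name: [] for name in process}
    for task in process.values():
        for nxt in task.successors:
            preds[nxt].append(task.name)
    return [
        name
        for name, task in process.items()
        if not task.is_complete(state)
        and all(process[p].is_complete(state) for p in preds[name])
    ]


# Hypothetical three-task process: write, then review, then design.
EXAMPLE_PROCESS = {
    "write_spec": Task(
        "write_spec", lambda s: s.get("spec_written", False), ["review_spec"]
    ),
    "review_spec": Task(
        "review_spec", lambda s: s.get("spec_reviewed", False), ["design"]
    ),
    "design": Task("design", lambda s: s.get("design_verified", False)),
}
```

Given the current state of the work products, ready_tasks returns the
tasks an engineer may start next, which is the kind of road map and
progress tracking that process-guided management is meant to provide.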

The application of the process is
supported by formal enactment of the
tasks defined in the handbook. For the
MBC team, this enactment was automated
in the Cleanroom Engineering Process
Assistant (CEPA), an automated process
support system which had the following
mission:

1. To minimize time lost because
supporting activities are not
properly coordinated. CEPA was
to significantly improve the
probability that all of the
prerequisites, tools and data that an
engineer needs to do a task are
available with no wasted time on
his or her part.


2.  To  enable  engineers  to  follow  the 
Cleanroom  process  and  thereby 
obtain  all  of  its  benefits. 

3.  To  enforce  the  Cleanroom  process 
in  the  most  unobtrusive  way 
possible  by  being  user-friendly. 

4.  To  enable  all  levels  of 
management  to  plan,  schedule  and 
control  project  tasks  and  to  ensure 
that  the  required  reviews  and 
verifications  take  place. 

5. To facilitate the collection of all
required metrics for providing
statistical control of the process
and for providing better estimates
of development time and cost.

6.  To  update  on-line  state  data,  the 
data  needed  to  develop  the 
product,  and  make  it  immediately 
available  to  all  members  of  the 
project  team. 

7.  To  improve  formal  and  informal 
communication  between  the 
members  of  the  group. 

The engineering handbooks and
the automated enactment gave project
staff a way to use a project framework
(the process model for the project) that
facilitates scheduling, task dispatching
and task status tracking.

Technology  Transfer 
Description 

To  conduct  the  demonstration, 
both  control  and  demonstration  groups 
were  identified.  The  control  group 
consisted  of  a  sample  set  of  ongoing  and 
completed  software  projects  at  the 
AMCCOM  LCSEC.  These  projects 
represent  the  use  of  "typical"  software 
engineering  methods  at  the  AMCCOM 
LCSEC. Enhancement projects at
Picatinny typically include the correction
of observed problems, the addition of
new capabilities, and in some cases,
re-engineering of software. The
demonstration project was the Mortar
Ballistic Computer (MBC)
re-engineering effort. The demonstration
aspect of this project was the adoption of
the CSE technology and PGPM
techniques as provided by the
participation of IBM and SET. The
hypothesis to be confirmed or rejected in
this demonstration was: The use of
CSE practices and process-driven
project management improves the
effectiveness (quality and
productivity) of the AMCCOM
LCSEC software support mission.

The  technology  transfer  package 
was  implemented  as  follows: 

(1) the transfer of Cleanroom
Engineering practices to give team
members the technical tools that
provide the human behavioral
changes necessary to create high
quality software with increased
productivity, and

(2)  the  transfer  of  process-guided 
project  management  to  orient  both 
individuals  and  teams  to  thinking 
and  working  within  a  PGPM 
environment. 

In  order  to  transfer  the 
technology,  process  and  culture  for  a 
Cleanroom  environment,  four  different 
tools  were  employed: 

(1)  training,  in  a  formal  classroom 
setting  which  integrated  lecture 
material  and  numerous  hands-on 
workshops  (tailored  for  this 
effort), 

(2)  coaching,  both  for  project 
planning  and  execution  as  well  as 
a  medium  to  promote  ongoing 
education, 

(3)  process  handbooks  (evolved  for 
this  effort),  which  act  as  a  written 
source  of  education  material  and 
as  a  reference  during  project 
execution,  and 

(4) an automated process support
system (developed for this effort)
that helps enforce process
adherence and monitors task
completion by automating
non-creative tasks.


Baseline  Metrics  for 
Control  Group  Projects 

The control groups represent the
state-of-the-practice at the AMCCOM
LCSEC. Baseline metrics were collected
in order to gain insight into project
practices and to establish a basis of
comparison to the demonstration
Cleanroom groups. Table II presents the
baseline metrics for the control group.
These metrics are presented with the
caution that some data collection
mechanisms are unreliable, resulting in
inaccuracies. The numbers in Table II are
similar to results reported by Mosemann
for other projects within the DoD [Ada
and C++: A Business Case Analysis,
July 1991].


Table II: Baseline Metrics for Control Group Projects

Project / Measure                  Control Group Projects

Number of Projects                 5
Range of Effort - Staff Months     21-58
Total Technical Staff Months       192
Total KLOC (*)                     23.14

DERIVED METRIC:
Productivity - LOC/Staff Month     121

(*) KLOC computed using NASA/Goddard formula of:
(New Lines of Code + 0.2 * Modified Lines of Code) / 1000
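
The derived productivity figure follows directly from the totals in
Table II. The short sketch below restates the KLOC formula given in the
table footnote and reproduces the 121 LOC per staff month value; the
function names are introduced here for illustration only.

```python
def effective_kloc(new_loc: float, modified_loc: float) -> float:
    """KLOC per the Table II footnote: modified lines are weighted at 0.2."""
    return (new_loc + 0.2 * modified_loc) / 1000.0


def productivity_loc_per_staff_month(kloc: float, staff_months: float) -> float:
    """Lines of code delivered per technical staff month."""
    return kloc * 1000.0 / staff_months


if __name__ == "__main__":
    # Control group totals from Table II: 23.14 KLOC over 192 staff months.
    print(round(productivity_loc_per_staff_month(23.14, 192)))
```

The split between new and modified lines for the control group is not
reported, so only the total KLOC from the table is used here.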


Observations 

The  following  observations  are  a 
compilation  of  IBM  and  SET  experiences 
with  the  MBC  teams.  These  observations 
are  in  the  context  of  IBM's  and  SET's 
other  experiences  with  replacing  craft- 
based  practices  with  engineering-based 
practices,  both  in  the  private  sector  and 
with  government  organizations.  One 
must  keep  in  mind  these  observations  are 
preliminary  since  the  project  has  not  been 
completed. 

1.  The  assigned  project  teams 
were able to assimilate and even adapt the
Cleanroom  Software  Engineering 
practices  and  process-guided  project 
management. 

A  common  worry  among 
managers  when  hearing  about  Cleanroom 
is  that  it  is  too  hard  or  too  mathematical 
for their staff. At Picatinny, engineers
were able to apply and adapt the
Cleanroom practices to the needs of their
project. Engineers learned, used and
extended the ideas successfully for their
project. The evidence for this observation
is the products they have produced.

Disciplined  engineering  in  a  team 
environment  requires  rigor,  cooperation 
of  individuals,  and  the  creativity  to  apply 
theory to real world problems. This
creates  a  challenging  work  environment 
that  tends  to  bring  out  the  best  in  both 
individuals  and  teams. 

A  prime  example  of  the 
accomplishments  of  the  MBC  team  was 
the  tailoring  of  the  box  structures 
algorithm  to  meet  both  their  application 
environment  and  the  target  programming 
language,  Ada.  MBC  team  members 
have  made  original  contributions  to  the 
expression  of  box  structure  constructs  in 
Ada,  which  will  have  applicability  across 
many Cleanroom projects. This has
benefited both the project, in terms of
constructive methods, and the individual
team members, in terms of a sense of
accomplishment. The team has enjoyed
using the various Cleanroom techniques
and has seen many real
accomplishments. The specification team
is  convinced  that  this  is  the  most  complete 
and  precise  specification  they  have  ever 
written.  The  step-wise  refinement  and 
verification,  which  drives  engineers  to 
define  one  small  step  to  take  at  a  time, 
take  that  step,  and  then  confirm  its 
correctness,  has  also  been  successful. 
The  development  team  is  convinced  that 
they  have  a  great  design  and  have 
minimized  the  amount  of  code  they  need 
to  develop. 

Furthermore,  as  the  MBC  team 
has  almost  completed  their  second 
increment,  it  has  already  shown  major 
gains  in  productivity.  Early  estimates 
show  that  productivity  has  tripled  despite 
the  learning  curve  of  working  with  a  new 
methodology.  Moreover,  this  measure 
includes  time  spent  toward  an  entire 
product  specification,  which  will  make 
future  increments  less  time  consuming. 
Thus,  team  members  are  optimistic  about 
continued  increases  in  their  productivity 
(although  future  predictions  can  only  be 
assertions  and  remain  to  be  confirmed  at 
project  completion). 

2.  Staff  morale  has  improved  on 
the  project  teams. 

Another  common  fear  of 
managers  when  hearing  about  Cleanroom 
is  that  their  staff  members  will  not  like  it 
due  to  the  rigor  of  the  process  and  the 
absence  of  positive  feedback  through 
debugging.  This  has  not  been  our 
experience  at  other  places  where  we  have 
introduced  Cleanroom.  Picatinny  is  no 
exception. When an organization
replaces craft-based practices with
engineering-based practices, morale
improves. The reason seems to be that
people now know what to do, when to do
it, and how it should be done. This
eliminates the uncertainty and anxiety for
project teams that now have a good
knowledge of exactly what has been done
and what is remaining to be completed.
When engineers learn to use the
Cleanroom practices, they know they can
do the high quality job they have been
striving to achieve. Engineers are
convinced that they are producing a better
product. As a result, they are excited
about it.

At the AMCCOM LCSEC, all the
engineers,  both  in  informal  contacts  and 
in  a  questionnaire  distributed  to  the 
engineering  staff,  reported  morale 
improvements.  The  AMCCOM  LCSEC 
management  has  also  confirmed  the 
existence  of  the  improved  morale  and,  of 
course,  is  favorably  impressed. 

3.  Facilitation  of  work  effort  is 
greatly  enhanced  through  process-guided 
project  management. 

Team leaders managed by process
definition and task lists, which allowed
more visibility of project status by
management. CEPA was the tool used to
clearly  define  the  tasks  necessary  for  the 
software  development  process  to  continue 
including  completion  conditions  for  each 
task  and  the  control  flow  that  dictated  the 
order  of  work  on  each  task.  The  intent 
with  process-guided  project  management 
is  to  give  engineers  a  clear  and 
understandable  road  map  which  they  can 
follow  and  by  which  they  may  track 
progress  towards  project  completion. 
Awareness  of  software  process  is  a  key 
issue  in  successfully  transferring 
technology  to  an  organization  and  to  an 
organization's  long  term  success  with 
applying  a  given  process-driven  approach 
to project management. PGPM must:

• reduce the time lost because
supporting activities are not properly
coordinated,

• enable engineers to follow the defined
process and thereby obtain all of its
benefits,

• enforce the defined process in the
most unobtrusive way possible by
being user-friendly,

• enable all levels of management to
plan, schedule and control project
tasks,

• facilitate the collection of all required
metrics for providing statistical
control of the process and for
providing better estimates of
development time and cost,

• update on-line state data, the data
needed to develop the product, and
make it immediately available to all
members of the project team, and

• improve formal and informal
communication between the members
of the group.

4.  The  team-oriented  approach  of 
CSE  saw  immediate  acceptance  and 
realized  both  tangible  and  intangible 
benefits. 

A key ingredient of Cleanroom is
that a team amplifies human performance.
People  took  advantage  of  the  insight  of 
others  in  order  to  bring  the  best  possible 
project  result.  Good  people  working 
together  brought  in  better  results.  The 
simple  idea  that  many  minds  are  better 
than  one  makes  the  outlook  for  quality 
good.  However,  some  less  tangible 
benefits  were  realized  as  well.  The  fact 
that  the  entire  team  is  responsible  for 
quality,  in  a  series  of  checks  and  reviews, 
puts  pressure  on  the  team  and  not  on 
individuals.  This  pressure  creates  a 
reliance  on  team  activity  over  individual 
performance. Furthermore, as successes
are encountered, the entire team takes
credit, not a single individual, thus
cementing the teamwork concepts. The
bottom  line  is  that  teamwork  improves 
individual  performance. 

Our  observation  is  that  the  MBC 
team  now  works  within  an  effective  team- 
oriented  environment.  We  believe  that 
further  use  of  Cleanroom  will  establish  a 
strong  team  mentality  that  will  serve  to 
further  improve  the  initial  good  results. 

5.  Coaching  is  a  key  ingredient  of 
technology  transfer  success. 


Although  the  training  was 
rigorous  with  a  mixture  of  theory  and 
workshops,  students  learn  at  different 
rates.  Coaching  allowed  IBM  and  SET 
staff to re-educate the slower-to-adopt
project  staff  members  and  keep  the  entire 
team  on  a  common  level  of  knowledge 
and  expertise.  IBM  and  SET  technical 
presence  at  project  inception  and  during 
project  execution  helped  solidify  the 
transfer  of  the  technology  and  ensured 
that  the  project  got  started  in  the  most 
efficient  manner. 

Furthermore,  there  was  a  gap 
between  the  end  of  training  and  the  start 
of  the  project  and  some  of  the  education 
was  forgotten.  Coaching  became  the 
mechanism  to  re-educate  and  supplement 
the  original  training.  Further,  as  good 
ideas  were  conceived  by  some  team 
members,  it  was  possible  to  see  that  all 
members  were  supplied  with  the  new 
ideas. 

As  the  project  progressed,  the 
CSE  ideas  needed  to  be  adapted  to  the 
specific  Picatinny  environment.  Coaches 
were  used  to  discuss  design  alternatives 
and  to  help  in  refining  the  technology  to 
best  serve  the  application. 

Perhaps  the  most  unnoticed  but 
effective  use  of  coaching  was  in  the 
positive  reinforcement  the  CSE  trainers 
were  able  to  give  to  the  team  members 
and  the  team  as  a  whole.  Coaches  are 
recognized  as  experts.  When  experts 
comment  positively  on  original  ideas  by  a 
team  member,  the  effect  can  be  enormous 
in  terms  of  self-esteem  and  sense  of 
accomplishment  and  contribution.  The 
CSE  trainers  tried  to  positively  reinforce 
the  behavior  of  those  making  such 
contributions  and  encourage  others  to 
seek  answers  beyond  the  limits  of  current 
knowledge.  The  "cheer  leading" 
approach  increased  project  satisfaction, 
which  motivated  greater  project 
performance. 

The  idea  of  coaching  with  positive 
reinforcement  was  first  formally  tried  out 
by IBM and SET on the Picatinny project
based on the hypothesis that it would be
helpful  in  technology  transfer.  The 
realized  benefits  far  exceeded  our 
expectations.  Based  on  this  experience,  it 
is  now  believed  that  coaching  should  be  a 
formal  part  of  any  technology  transfer 
effort. 

6.  Communication  among  teams 
(and  between  team  members)  is  greatly 
enhanced  through  process-guided  project 
management. 

An  important  ingredient  of  any 
process-guided  activity  is  communication 
among  contributing  teams  and 
individuals.  One  aspect  of  this  was  that 
no team culture existed at the AMCCOM
LCSEC, meaning that no real notion
existed  of  how  teams  are  supposed  to 
behave  during  project  execution.  This 
problem  manifested  itself  in  many 
different  ways.  Testing  teams  often  did 
not  receive  specification  updates  (and 
failed  to  ask  for  them).  Also,  work 
tended  to  be  duplicated  by  multiple  team 
members  because  the  division  of  tasks 
was  unclear  and  communication  among 
members  occurred  too  seldom. 

There  were  two  aspects  of  solving 
this  problem  at  Picatinny.  The  first  was 
to  establish  effective  communication 
among  team  members  and  the  second  was 
to  establish  communication  among  the 
different  departments  involved  in  the 
project.  The  adoption  of  a  well  defined 
process  includes  a  vocabulary  that  is  of 
great  help  to  the  understanding  and 
discussion  of  the  process.  This  well 
defined  vocabulary  makes  communication 
between  team  members  much  more 
effective  and  productive.  Our 
observation  indicates  that  communication 
among  team  members  significantly 
improved  via  the  team  approach  and 
strengthened  through  the  use  of  CEPA. 
The  improved  communication  also  started 
a  shift  in  the  culture  of  the  teams.  Team 
members  report  that  they  readily  use  each 
other  as  information  sources,  quality 
checks,  etc.  Team  reviews  are  effective 
and informative. However, the second
aspect, communication between
departments, continues to be a problem.
The  MBC  certification  team  members 
work  for  a  different  department  than  the 
specification  and  development  teams. 
As a result, the certification team
finds itself working from outdated
specifications. Furthermore, certification
team members seem to duplicate each
other's work. A future
goal  is  to  be  able  to  duplicate  the  success 
of  the  specification  and  development 
teams  in  the  certification  team,  primarily 
by  improving  communications.  A  more 
concerted  effort  should  have  been  made 
by  the  coaches  to  minimize  these 
communications  problems. 

7.  Process-guided  project 
management  supports  engineers  in 
mastering  a  new  technology. 

Process-driven,  now  referred  to 
as  "Process-guided,"  project  management 
is  one  of  the  two  basic  technologies  being 
advanced  by  the  STARS  program.  The 
Picatinny  project  was  the  first  project  on 
which  this  key  idea  has  been  employed. 

The  reason  process-guided  project 
management  seems  to  support  technology 
transfer  can  be  summarized  as  follows. 
When  doing  something  for  the  first  time, 
one  often  asks,  "What  do  I  do  next?"  or 
"When  will  I  be  done?"  This  indicates  a 
lack of understanding of the big picture,
where  engineers  can  clearly  place  their 
efforts  in  a  project  context.  This  is  not 
only  an  attribute  of  first  time  usage  of 
techniques  or  a  process,  but  also  an 
indication  that  a  clearly  defined  process 
does  not  exist  or  is  not  effectively 
managed. 

By  placing  the  Cleanroom 
techniques  within  a  fully  defined  process, 
AMCCOM  LCSEC  engineers  knew 
precisely  what  step  they  were  currently 
on,  as  well  as  what  had  been  completed 
and  what  remained  to  be  done.  Giving 
each  individual  the  foresight  that  showed 
where  they  were  in  the  context  of  the 
entire  project  strengthened  project  identity 
and boosted morale.

The results of this project also
indicate that experienced engineers would
gain productivity benefits by employing a
process support system. However, that
can only be tested by comparing the
performance of experienced teams, which
was not possible at Picatinny.

8.  CEPA,  the  Cleanroom 
Engineering  Process  Assistant,  despite 
some  shortcomings,  provided  valuable 
process-guidance  support  for  the  project. 

There  were  a  number  of  known, 
as  well  as  discovered,  shortcomings  in 
developing  and  using  CEPA.  It  was  an 
enhancement  of  a  prototype  system 
developed  during  a  previous  STARS 
Phase.  The  enhanced  system  was  to 
provide  support  to  engineers  using  a 
specific  Cleanroom  process  model.  This 
approach  was  known  to  be  somewhat 
limiting,  but  was  used  in  order  to 
determine  the  level  of  constraint 
necessary  for  engineers  to  easily  adopt 
process-guided  engineering.  Although  the 
engineers  did  report  finding  the  product 
constraining,  CEPA  did  allow  engineers 
to  identify  the  tasks  assigned  to  them  and 
locate  all  files  necessary  to  complete  the 
tasks.  Team  leaders  could  also  focus 
their  management  effort  based  on 
assigned/outstanding  and  completed 
tasks.  This  status  reporting  feature 
allowed  team  leaders  to  manage  project 
tasks  at  a  more  reasonable  level  of 
granularity,  which  permitted  them  to 
maintain  the  project  under  greater 
intellectual  control. 

CEPA was viewed as being
tightly coupled with the process. As a
result, formal training in using it was not
given, although such training would have
made its use more effective. The lack of CEPA
training  was  a  significant  shortcoming 
that  needs  to  be  rectified  in  future 
technology  transfer  efforts.  Additionally, 
formal  training  in  using  the  underlying 
tools  in  CEPA  would  have  been  useful. 
Other  problems  with  the  CEPA 
implementation  used  at  Picatinny  included 
a  clumsy  user  interface  and  difficulties  in 
using  the  software  on  a  network. 


It  was  observed  that  automated 
process  support  is  quite  helpful  in 
supporting  technology  transfer.  This  is  in 
spite of some of the shortcomings of the
system  that  the  MBC  staff  was  asked  to 
use.  The  developers  of  CEPA  learned  a 
great  deal  about  how  people  use  such  a 
system;  and  consequently,  requirements 
for  an  enhanced  process  support  system 
were  identified  and  modified.  The 
automated  process  support  system  that  is 
to  be  transferred  to  Peterson  Air  Force 
Base  has  been  improved  as  a  result  of  this 
usage. 

9.  Specific  technological  aspects 
of  the  Cleanroom  Software  Engineering 
practices  were  easily  and  successfully 
used. 

Specific techniques are the
means by which engineers change their
behavior and improve their performance.
Three techniques in particular were
discussed  by  project  staff  as  being  major 
sources  of  their  improved  performance. 
These  techniques  are  team  reviews, 
Cleanroom  specifications  and  box 
structured  design,  and  are  described  in 
greater  detail  below: 

Team  reviews,  although 
experiencing  a  slow,  awkward  start,  were 
cited  by  team  members  as  one  of  the  most 
successful  aspects  of  the  new  activity. 
Members  report  that  the  team  shared 
responsibility  eased  misgivings  about 
participating  in  such  a  big  project.  This 
negated  "finger  pointing"  that  existed  in 
previous  projects  and  allowed  even 
difficult  personality  combinations  to  work 
together.  The  result  was  that  everyone 
participated  and  worked  as  a  team  toward 
project  success  and  completion.  Morale 
increased  sharply  as  groups  of 
individuals  transformed  into  an  effective 
software  team. 

Cleanroom specification,
most notably black box documentation,
was cited as being responsible for gains
in productivity. Many talented engineers
were on the project and their
productivity was significantly enhanced
when working from a well defined
problem statement. The completeness of
the specification was the main reason
cited for the team's confidence that they
were producing a high quality product.

Box  structured  design  is 
credited  with  focusing  the  code 
generation  process  and  with  making  team 
reviews  more  effective.  The  team 
enjoyed  the  orderly  process  of  developing 
software.  It  got  them  started  more 
quickly  on  solving  a  particular  problem 
and  they  were  able  to  measure  the 
progress  of  the  development  activity  with 
more  precision  than  in  the  past.  Since  the 
process  relies  a  great  deal  on  logical 
thinking  as  opposed  to  programming 
skill,  less  experienced  programmers  are 
able  to  take  a  bigger  share  of  the 
development  burden.  Therefore  software 
engineers  can  make  the  most  of  their 
software  engineering  skills  without 
having  to  develop  in-depth  programming 
language  expertise. 


Results 

The  most  important  result  noted 
by  this  effort,  even  in  its  preliminary 
form,  is  that  the  motivation  to  continue  to 
use  Cleanroom  practices  and  PGPM  at 
Picatinny  has  been  established.  This 
demonstration  effort  was  sponsored  by 
STARS  and  the  continued  effort  is  being 
sponsored  by  the  AMCCOM  LCSEC. 
This  result  is  an  instance  of  the  STARS 
program  fulfilling  its  mission  by  being 
the  catalyst  for  introducing  improvements 
to  the  software  engineering  capabilities  in 
the DoD. In one sense, the fact that the
effort is to be expanded across the entire
organization is the most definitive
conclusion of this effort.

In  addition  to  the  above 
mentioned  conclusion  to  this  effort,  the 
following  conclusions  can  be  drawn 
based  on  the  current  status  of  the  MBC 
project. 

1.  It  is  possible  to  transfer  CSE 
and  PGPM  practices  to  project  teams 
operating  within  a  typical  immature  DoD 
SSA  organization. 

This  was  shown  by  the  fact  that 
the  MBC  project  has  progressed  to  a 
point where CSE and PGPM are being
successfully  applied.  This  result  shows 
that  a  specific  maturity  rating  is  not 
necessary  in  order  to  benefit  from 
Cleanroom  Software  Engineering  or 
Process-Guided  Project  Management. 
The  engineering  staff  also  enjoyed  using 
the  ideas,  and  all  were  interested  in  using 
the  ideas  again.  Additionally,  nearly  all 
were  interested  in  supporting  and 
participating  in  the  establishment  of  a 
"Cleanroom  Competency  Center"  at  the 
Picatinny  Arsenal. 

2.  Typical  immature  DoD  SSA 
organizations  can  realize  important 
benefits,  in  terms  of  improved  process 
productivity,  product  quality,  and  staff 
morale, from the application of CSE and
PGPM. 

This  conclusion  is  supported  by 
the apparent tripling of productivity of the
MBC  team.  Table  III  shows  the  results 
for  productivity  for  increments  1  and  2. 
Productivity increases because the
training and learning-curve effort is not
repeated for successive increments.
Future increments should show additional
improvement.

Table III: Productivity Change for MBC Re-engineering

Increment   LOC per Staff Month   Change in Productivity based on
                                  Picatinny Baseline Metrics
1           370                   3.06:1
1+2         428                   3.54:1
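The ratios in Table III are consistent with dividing each increment's productivity by a baseline of 121 LOC per staff month, the figure used in the Table V calculations. A minimal sketch of that arithmetic, assuming the 121 LOC/SM figure is the Picatinny baseline behind Table III (the function and constant names are illustrative, not from the project):

```python
# Baseline productivity in LOC per staff month. This value appears in the
# Table V calculations; treating it as the Table III baseline is an assumption.
BASELINE_LOC_PER_SM = 121


def productivity_ratio(loc_per_sm, baseline=BASELINE_LOC_PER_SM):
    """Return measured productivity as a multiple of the baseline."""
    return loc_per_sm / baseline


# Productivity figures reported for the MBC increments.
for increment, loc_per_sm in [("1", 370), ("1+2", 428)]:
    print(f"Increment {increment}: {productivity_ratio(loc_per_sm):.2f}:1")
```

Rounded to two decimals, the two ratios come out at 3.06 and 3.54, matching the table.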

Two  important  observations  from 
the  MBC  project  are  that  (1)  PGPM  has 
aided  the  learning  process  and  helped 
ease  the  transfer  and  application  of  the 
CSE  technology  and  (2)  following  a  well 
defined  process  significantly  improves 
team  productivity  and  morale. 

Early indications are that quality
has improved over previous product
quality, according to Picatinny's
customers. The quality of the
software developed during the first
increment  is  very  high  when  compared  to 
quality  for  traditional  software 
development.  The  MBC  team  is  excited 
about  the  prospect  of  the  upcoming  test  of 
their  second  increment.  Thus,  the  result 
achieved  will  be  viewed  by  the  MBC 
team  as  the  mark  to  better  on  the  next 
increment  of  this  project.  The  incentive 
and  motivation  for  continual  improvement 
is  firmly  in  place  among  MBC  team 
members. 


Table IV: Quality: Number of First Increment Failures

Failure Type   Number of Failures   Description
Process        19                   Approved improvements made to design but not
                                    reflected in updates to the specifications.
Spelling       3                    Misspellings on displays.
Behavior       2                    Coding errors; did not work as specified.

First Increment Failure Rate = 2 Behavior Failures / 8500 LOC = 0.24 Failures/KLOC
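The failure rate quoted with Table IV counts only the two behavior failures against the 8500 lines of first-increment code. A short sketch of the calculation, with an illustrative function name that is not part of the project's tooling:

```python
def failures_per_kloc(failures, loc):
    """Failure density expressed as failures per thousand lines of code."""
    return failures / (loc / 1000)


# Only behavior failures are counted in the reported rate; process and
# spelling failures are listed in the table but excluded from it.
behavior_rate = failures_per_kloc(2, 8500)
print(f"{behavior_rate:.2f} failures/KLOC")
```

Counting all 24 recorded failures instead would give a noticeably higher density, so the choice of which failure types to include matters when comparing against other projects.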


3.  The  return  on  investment  at 
Picatinny  cannot  be  definitively 
calculated,  but  indications  are  that  there  is 
a  significant  return  on  investment. 

Since the project is not yet
complete, a preliminary estimate of return
on investment can only be based on
the information currently
available. The resulting return on
investment (ROI) calculations appear in
Table V.

Table V: Return on Investment for MBC Re-engineering

Increment   LOC per       Staff Months MBC would have       ROI      ROI with   ROI with
            Staff Month   taken without CSE                 Base     Coaching   Training
1           370           8500 LOC / 121 LOC/SM = 70.3      5.9:1    8.9:1      14.1:1
1+2         428           18200 LOC / 121 LOC/SM = 150.4    9.8:1    20.4:1     17.1:1

SM = Staff Months

ROI Base = (SM MBC would have taken - SM MBC took) /
           (Staff Months of Coaching Effort + Staff Months spent in Training)

ROI with Coaching = (SM MBC would have taken - SM MBC took - SM of Coaching Effort) /
                    (Staff Months spent in Training)

ROI with Training = (SM MBC would have taken - SM MBC took - SM spent in Training) /
                    (Staff Months of Coaching Effort)
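The three ROI definitions given with Table V can be written as small functions. The sketch below is illustrative only: the function names are invented here, and the input values in the example are hypothetical round numbers, since the actual staff months expended, coached, and trained are not stated in the text.

```python
def staff_months_without_cse(loc, baseline_loc_per_sm=121):
    """Estimated effort at the baseline productivity used in Table V."""
    return loc / baseline_loc_per_sm


def roi_base(sm_without, sm_took, sm_coaching, sm_training):
    """Staff months saved per staff month of coaching plus training."""
    return (sm_without - sm_took) / (sm_coaching + sm_training)


def roi_with_coaching(sm_without, sm_took, sm_coaching, sm_training):
    """Coaching counted as project cost; return measured against training."""
    return (sm_without - sm_took - sm_coaching) / sm_training


def roi_with_training(sm_without, sm_took, sm_coaching, sm_training):
    """Training counted as project cost; return measured against coaching."""
    return (sm_without - sm_took - sm_training) / sm_coaching


# Hypothetical inputs, NOT the project's figures.
example = dict(sm_without=70.0, sm_took=20.0, sm_coaching=3.0, sm_training=5.0)
print(f"ROI base:          {roi_base(**example):.2f}:1")
print(f"ROI with coaching: {roi_with_coaching(**example):.2f}:1")
print(f"ROI with training: {roi_with_training(**example):.2f}:1")
```

Each variant moves one investment from the denominator into the cost side of the numerator, which is why the three ratios can differ widely for the same project.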


If  these  assertions  are  correct,  one 
must  also  realize  that  productivity  will 
increase with the later increments because
specifications  are  complete  for  the  entire 
system.  Once  again,  the  final  calculation 
of  return  on  investment  awaits  project 
completion. 

4. An automated Process
Support System (PSS) that is consistent
with the process defined for the project
facilitates technology transfer.

Automating the non-creative tasks
of a new technology, such as file access
and simple process flow, facilitates the
adoption of the new technology. This
was true even for a system with the
limitations that were known, and subsequently
observed, in CEPA. CEPA's successor
system  (being  developed  for  deployment 
on  the  STARS  Air  Force  demonstration 
project)  applies  many  of  the  lessons 
learned  from  observing  CEPA  use  at 
Picatinny. 


5.  Based  on  this  demonstration 
we  now  believe  that  a  technology  transfer 
program  to  support  individual  projects  at 
a  typical  DoD  SSA  organization  must 
begin  with  a  defined  process  for  the 
project  and  should  consist  of  the 
following  five  components: 

(1)  formal  CSE  training, 

(2)  training  in  PGPM, 

(3)  the  availability  of  engineering 
handbooks, 

(4)  the  use  of  a  PSS  (e.g.,  CEPA 
and  its  successor),  and 

(5)  the  availability  of  qualified 
coaching. 

The  combination  of  technology 
transfer  components  created  a  series  of 
successes at Picatinny, including
productivity  gains,  expected  quality 
gains,  and  the  increased  motivation  of  the 
engineering  staff. 

The  MBC  project  has  realized 
significant  gains  from  the  CSE  ideas. 
Once the learning curve had been
completed, initial successes in creating
the Black Box specification served to
cement commitment to CSE.

The  resulting  conclusions  from 
the  overall  evaluation  are  preliminary 
because  the  demonstration  project  is  still 
in its early stages. However, the original
hypothesis that Cleanroom improves the
effectiveness of the software
re-engineering activities at Picatinny looks
very promising. Indeed, management
and staff agree that morale and motivation
are extremely high, that teamwork is now
the  normal  mode  of  operation,  and  that 
people  are  excited  about  the  software 
process  being  established  and  are 
motivated  to  produce  high  quality 
products. 

A  good  technical  road  map  is  in 
place  at  Picatinny;  the  technical  personnel 
are  developing  the  skills  that  appear  to 
show  significant  gains  in  productivity. 
Even  more  promising  is  the  fact  that  these 
gains  were  made  with  minimal  exposure 
to  CSE.  Future  gains  are  likely  to  be  of 
greater  magnitude  as  projects  are  carried 
out  by  experienced  teams  of  Cleanroom 
engineers  well  advanced  on  the  learning 
curve. 


Future  Directions 

The  next  steps  for 
LCSEC/STARS  cooperation  have  been 
defined. The impressive results to date in
the areas of productivity, quality, return
on investment and morale have convinced
AMCCOM LCSEC management to
continue the work begun under this
demonstration project and to expand it
further throughout the organization.
This will include the following:

Complete  remaining  phases  of 
MBC  project  of  which  there  are  three 
increments  in  the  initial  plan.  This  will 
result  in  a  completely  re-engineered 
system. 

Gain insight from a second
experimental project, the Bradley
Institutional Conduct of Fire Trainer, a
software block update of a FORTRAN
system. This project is a maintenance
software block update for the aiming
system on the Bradley fighting vehicle.
CSE and PGPM techniques are being
adapted for use in this important area.
Much needs to be learned in adapting the
successful techniques used on the MBC
to the area of software maintenance.

Create a Cleanroom Process
Team  to  build  CSE  and  process  expertise 
in  house.  This  group  is  responsible  for 
continuous  review  and  study  of  the 
application  of  Cleanroom  and  PGPM  at 
Picatinny.  This  group's  charter  is  to 
internalize  and  refine  CSE  and  PGPM  to 
the  software  practices  at  Picatinny. 

Plan to use CSE for all future
development  and  re-engineering  projects 
internal  to  the  LCSEC.  It  is  cost  effective 
to  use  these  techniques  for  any 
maintenance  project  that  requires  effort 
equal  to  changing  one  third  or  more  of  the 
original code. Since productivity
with CSE is about three times that of
conventional methods, a whole system can be
re-engineered for the same cost as, or less
than, a maintenance update. The technology
transfer  package  developed  by  IBM  and 
SET  can  be  used  by  the  trained  staff  to 
support  these  projects.  The  MBC  team  is 
fortunate  to  have  talented  team  members 
who  can  carry  this  out. 
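The one-third threshold follows from a simple break-even argument. The sketch below assumes that effort scales linearly with the number of lines written or changed and that CSE roughly triples productivity; the symbols N (system size in LOC), f (fraction of the code a conventional maintenance update must change), and p (conventional productivity in LOC per staff month) are introduced here for illustration only.

```latex
\[
\underbrace{\frac{N}{3p}}_{\text{re-engineer all } N \text{ lines with CSE}}
\;\le\;
\underbrace{\frac{fN}{p}}_{\text{change a fraction } f \text{ conventionally}}
\quad\Longleftrightarrow\quad
f \;\ge\; \frac{1}{3}
\]
```

Under these assumptions, full re-engineering costs no more than the maintenance update whenever at least one third of the original code would have to change.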

Evolve  the  Process  Support 
System.  The  understood  and  observed 
shortcomings  in  CEPA  are  primarily 
addressed  by  the  PSS  being  developed  by 
the  IBM  STARS  team  for  use  on  the 
STARS  Air  Force  Demonstration  Project 
at  Peterson  Air  Force  Base.  The  major 
source  that  provided  input  to  the 
specification  process  for  the  PSS  was  the 
experience  gained  by  using  CEPA  at 
Picatinny.  As  a  result,  many  of  the 
improvements  desired  by  Picatinny  are 
being  developed  as  a  part  of  the  PSS. 
The  PSS  needs  to  be  delivered  to 
Picatinny,  as  well  as  to  Peterson  Air 
Force Base. As a result, Picatinny will
receive its desired functionality and will
help provide a second test bed for the
PSS.


Evolve  Cleanroom  Software 
Engineering  into  a  complete  life  cycle 
process  as  far  as  is  feasible  using 
standard  Cleanroom  techniques.  Work  is 
currently  underway  to  perform  a  mapping 
of  the  evolved  Cleanroom  Software 
Engineering  process,  as  used  at 
Picatinny,  against  SEI  developed 
Software  Process  Frameworks  (SPF)  for 
the  SEI  CMM  Level  2  and  3.  The  results 
of  this  mapping  will  provide  Picatinny 
with  a  road  map  of  all  areas  that  are  either 
not  covered  or  are  poorly  covered  by  the 
CSE  process.  These  results  will  be  used 
to  identify  areas  for  Cleanroom  process 
definition as far as is practical. Areas that
are not practical for Cleanroom extension
will be defined with non-Cleanroom
extensions. The result will be a
complete  life  cycle  process  definition  for 
Picatinny  to  support  their  move  to  SEI 
CMM  level  3.  Additional  work  is  being 
done  in  the  area  of  Metrics  to  determine  a 
baseline  of  measures  to  support  a  level  3 
organization.


Abstract

This  paper  illustrates  two  system  definition 
methodologies  developed  by  the  Engineering  of  Complex 
Systems  (ECS)  project.  The  methodologies  are  the 
System  Design  Views  and  the  System  Design  Factors. 
These  methodologies  are  used  to  provide  a  detailed 
characterization  of  the  hardware  components  of  the 
AN/BQG-5 Wide Aperture Array (WAA) system. The
system  hardware  components  are  characterized  using  the 
Informational, Functional and Implementational System
Design  Views.  Properties  including  performance, 
physical  attributes  and  future  needs  of  the  system  are 
presented  using  the  hierarchical  System  Design  Factors 
approach.  The  benefits  and  utility  of  the  information  in  a 
system  reengineering  environment  are  discussed. 

1:  Introduction 

One  area  of  evolving  system  engineering 
methodologies  for  complex  systems  focuses  on  providing 
a  comprehensive  system  definition  early  in  the  design 
cycle.  Commercially  available  system  engineering  tools 
such as RDD-100 allow system designers the facility to
generate various design capture views. The views provide
the  mechanism  to  completely  define  system  requirements, 
notional  architectures  and  system  behavior.  The  goal  is 
to  mitigate  risk  by  validating  candidate  architectures  and 
identifying  potential  problems  prior  to  building  a 
prototype  of  the  system. 

When  beginning  to  specify  the  requirements  for  a 
complex  military  system  replacement  or  upgrade,  it  is 
important  to  characterize  the  existing  system.  The 
designer  uses  the  information  obtained  from  the 
characterization  of  the  baseline  system  in  two  valuable 
ways: 

1. To identify areas which are candidates for
improvement  in  the  new  system  design. 


2.  To  provide  a  basis  of  comparison  between 
the  existing  system  and  the  new  system. 


For  example,  cost  versus  performance  may  be  the 
primary  tradeoff  in  design  of  the  new  system.  The 
baseline  characterization  of  the  existing  system  may 
reveal  that  the  ratio  of  cost  to  throughput  is  high.  In  this 
case  the  designer  must  ensure  that  the  cost  to  throughput 
ratio  for  the  new  system  does  not  exceed  that  of  the 
existing  system. 

The  preceding  example  illustrates  how  specific 
information  obtained  from  a  characterization  of  a  baseline 
system  can  be  used  to  define  and  evaluate  a  new  system 
design.  The  example  focused  on  a  specific  aspect  (cost  vs. 
performance) of the system definition. It is clear that the
consideration of only this aspect provides a narrow and
incomplete  definition  of  the  system. 

As  part  of  the  Engineering  of  Complex  Systems 
(ECS)  project,  scientists  at  the  Naval  Surface  Warfare 
Center  (NSWC)  are  actively  pursuing  research  in  the 
areas  of  system  definition  using  several  design  capture 
views.  The  following  design  capture  views  and  objectives 
of each view have been defined by N. T. Hoang¹.

1.  Informational 

•  Characterizes  system  concept  of  operations 

•  Represents  system  in  abstract  terms 

2.  Functional 

• Defines system functions and decompositions

•  Defines  data  flow  requirements 

3.  Behavioral 

•  Defines  system  critical  paths 

•  Specifies  system  real-time  characteristics 

4.  Implementational 

• Defines physical hardware, software, and human
resources

• Specifies system physical interconnectivity

5. Environmental

• Establishes conditions and events constraining
system operations

• Specifies performance MOEs and conditions of
measurement

Within these design capture views NSWC researchers
have defined a set of System Design Factors² (SDF). The
SDF represent a proposed standard methodology to
describe the properties, attributes and characteristics of the
system.  The  SDF  can  also  be  used  to  quantify  specific 
aspects  of  the  design  and  to  perform  trade-off  analysis. 

This report presents a baseline characterization of the
AN/BQG-5  Wide  Aperture  Array  (WAA)  system 
hardware.  The  goal  of  this  effort  in  an  overall 
reengineering  environment  is  to  provide  a  detailed 
baseline to be used for comparison and trade-off analysis
between  relevant  aspects  of  the  existing  system  and  the 
proposed  system.  In  the  following  case  study  the 
characterization  of  an  existing  system  will  be  represented 
using the Informational, Functional, and
Implementational design capture views. In addition,
System Design Factors in the areas of performance,
physical  attributes,  and  future  needs  will  be  used  to 
describe  the  baseline  system  hardware. 

2:  System  Design  Views 

The  following  three  System  Design  Views  are  the 
Informational,  Functional  and  Implementational  views. 
The  views  focus  on  the  hardware  components  of  the 
system.  Each  view  contributes  a  different  aspect  of  the 
overall system description. Collectively the design
capture views provide a detailed characterization of the
AN/BQG-5  WAA  system  hardware.  However,  the 
system  hardware  definition  provides  an  illustration  of 
only  one  aspect  of  the  research  being  conducted  by  the 
ECS  project. 

2.1:  Informational 

This  design  view  provides  a  textual  description  of  the 
overall  system  functions  and  simple  concept  of 
operations.  In  addition  some  hardware  specific 
information  is  included.  The  information  in  this  view  is 
intended  to  familiarize  the  designer  with  the  functionality 
and  use  of  the  baseline  system. 

The  AN/BQG-5  WAA  is  a  stand  alone  acoustic 
sensing  and  tracking  system  designed  to  augment  the 
combat system in 688 class attack submarines. The
WAA system provides the capability to acquire, track and
rapidly analyze an acoustic contact and conduct Target
Motion Analysis (TMA) without maneuvering the ship.
The inboard system hardware consists of 3 Common
Electronic Equipment Enclosures (CEEE). The units are
the WAA Receiver (1306), WAA Beamformer (1307) and
WAA Processor (1308). A brief description of each unit
and the unit resident firmware follows.

The  WAA  Receiver  cabinet  receives  multiplexed 
Manchester  coded  serial-bit  format  data  transmitted  via 
twisted  shielded  pairs  of  wire  from  the  canisters  in  the 
WAA  Passive  Receive  Outboard  Electronics  (OBE).  The 
WAA  Receiver  receives,  demultiplexes,  and  reformats 
data signals for output to the WAA Beamformer.
Required  DC  power  and  control  signals  for  the  WAA 
OBE  are  carried  down  twisted  shielded  pairs  of  wire  from 
the  WAA  Receiver. 

Firmware resident in the WAA Receiver provides
control and performance monitoring of data that is input
from the WAA OBE and formats the data for output to
the WAA Beamformer. The firmware receives OBE
controls and parameters from other system software and
formats these for transfer to the OBE. The firmware
includes control of a Pseudo Random Noise Generation
(PRNG)/Cal generation function which can generate a
single bit of Pseudo Random Noise (PRN) or calibration
signal to be passed to the OBE for generation of a test
signal. The firmware performs a gain calculation to be
used for OBE gain control. The firmware also performs
Digital ACINT Data Acquisition System
(DADAS)/ACINT Calibration injection upon command.

The WAA Beamformer cabinet receives WAA
hydrophone data from the WAA Receiver; provides
required acquisition and track beamforming; implements a
post beamformer correlation function; provides selectable
beam data output for recording, audio, and calibration;
performs integration and requantization of the correlogram
for display; provides post beamforming AOBT signal
injection; and provides resources for tracking,
localization, performance monitoring and unit control.

The  resident  firmware  receives  DIMUS  hydrophone 
data  and  linear  hydrophone  data  and  the  necessary  controls 
to  manage  the  track  and  acquisition  beamforming  and 
signal  processing. 

The  acquisition  processing  provides  DIMUS 
beamforming;  post  beamformer  AOBT  signal  injection; 
post  beamformer  filtering  and  correlation  processing; 
selectable  digital  beam  output  for  recording;  stabilization, 
integration,  normalization,  and  requantization  of 
correlation  data  for  display;  PM/FL  and  processing 
control. 
          BCM ASICs    68000       56001     TMS320C30      TOTAL
          @25 MIPS   @0.75 MIPS   @20 MIPS   @25 MFLOPS   MIPS/MFLOPS

BCM          900                                              900/0
GCM                       9                                     9/0
CFM                                  300                      300/0
SPM                                 3120                     3120/0
FPM                                             1400         0/1400
GCA                                  300                      300/0
[EFM]                                100                    [100/0]
TOTAL       [900]         9         3820        1400      4729/1400

Table I. MIPS/MFLOPS Calculations for Unit 1307

The track processing consists of linear beamforming;
post beamformer filtering and correlation processing;
DEMON classification processing; selectable digital
beam and hydrophone data output for recording and audio;
stabilization, integration, normalization, and
requantization of correlation and DEMON data for
display; noise estimation; PM/FL; and processing
control.

The WAA processor supports functional processing
for acoustic data, On-Board Training, Input/Output (I/O)
processing, Database Management processing, disk
storage and Workstation processing. The unit
functionality is not provided by firmware but by
application software executing in processor nodes
connected by a LAN called FLEXNET. FLEXNET
supports data transfer and communication between nodes
within the unit.

2.2:  Functional 

The functional design view of the WAA subsystem
is represented by characterizing system throughput and
resource utilization. Information describing the functional
partition is also presented. The information presented in
this view provides insight into the computational
allocation of the baseline system. When beginning to
develop a strategy for allocation of processing resources
for a system upgrade, it is helpful to first examine the
baseline system allocation as a starting point. By
characterizing the throughput of the existing system, the
designer gains some insight into the overall processing
requirements of the system as well as specific functional
partitions.

The system throughput is represented in MIPs and
MFLOPs. The example unit calculations listed in Table I
for unit 1307 are based on analysis of the number and
types of processors resident in the unit^. The modules
which contain processors are listed on the left side of the
table. The various processors resident on the modules are
listed on the top of the table along with the specific
MIP/MFLOP rating for each module type. Modules based
on two microprocessors, two Digital Signal Processing
(DSP) chips and one Application Specific Integrated
Circuit (ASIC) provide the system processing. The DSPs
are the Motorola 56001 and Texas Instruments
TMS320C30. The microprocessors are the Motorola
68000 and 68030. The ASIC is a special purpose chip
called the Beamformer Computational ASIC.

For the throughput characterizations the processor
ratings are estimated to be 2 MIPs for the 68030, 0.75
MIP for the 68000, 20 MIPs for the 56001, 25
MFLOPS for the TMS320C30 and 25 MIPs for the
Beamformer Computational ASIC.
The ratings for the ASIC and DSPs are estimates based
on published benchmarks and clock speed, and have
been derated to estimate actual sustained throughput rather
than peak throughput. The MIP rating for the 68030 is
based on specified CPU module performance.

In unit 1306, a combination of 68000 and 56001
processors provides a total of 486 MIPs. The 56001s
support the demultiplexing of data received from the OBE.
The 68000s support unit control functions.

Table I shows that a combination of all processor
types contributes to the total 1307 unit throughput
capacity of 4729 MIPs and 1400 MFLOPs. The primary
functions performed in this unit are beamforming,
detection and tracking. The majority of the processing is
performed by the DSPs and the ASIC. Again, the 68000
supports unit control functions and does not contribute
significantly to the overall unit computational capacity.
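
The unit 1307 totals can be reproduced from the module counts and the processor ratings. The Python sketch below is illustrative only. It takes module counts from the TOTAL column of Table III, assumes one ASIC per BCM, attributes the 100 MIPS entry in the 56001 column to the five EFM modules, and credits the GCA only with its 56001, since that is what the 300 MIPS entry in Table I implies. The names RATING, MODULES and unit_throughput are hypothetical.

```python
# Per-processor sustained ratings quoted in the text: (unit of measure, value).
RATING = {
    "ASIC": ("MIPS", 25.0),        # Beamformer Computational ASIC
    "68000": ("MIPS", 0.75),
    "56001": ("MIPS", 20.0),
    "TMS320C30": ("MFLOPS", 25.0),
}

# (module, processor type, processors per module, modules in unit 1307).
# Assumption: the 68000 on each GCA is not counted, matching Table I.
MODULES = [
    ("BCM", "ASIC", 1, 36),
    ("GCM", "68000", 1, 12),
    ("CFM", "56001", 1, 15),
    ("SPM", "56001", 2, 78),
    ("FPM", "TMS320C30", 2, 28),
    ("GCA", "56001", 1, 15),
    ("EFM", "56001", 1, 5),
]


def unit_throughput(modules=MODULES, rating=RATING):
    """Return (total MIPS, total MFLOPS) summed over all modules."""
    totals = {"MIPS": 0.0, "MFLOPS": 0.0}
    for _name, processor, per_module, count in modules:
        unit, value = rating[processor]
        totals[unit] += count * per_module * value
    return totals["MIPS"], totals["MFLOPS"]


mips, mflops = unit_throughput()
print(f"Unit 1307: {mips:.0f} MIPS, {mflops:.0f} MFLOPS")
```

Under these assumptions the sums agree with the 4729 MIPs and 1400 MFLOPs quoted for the unit.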

Although a table is not presented for unit 1308, the
processing is provided entirely by 68030 processors for a
total unit throughput capacity of 95.2 MIPs^. The 68030s
resident on the CPU module and the NSU module, which
performs FLEXNET processing, have measured
throughput of 2 MIPs. Other 68030s are resident on I/O
modules (FST, NTDSE, SCSI, SIOC, CSIO) and have
measured throughput of 1.6 MIPs.

The pie charts in Figure 1 provide an overall system
summary which illustrates how the throughput is
partitioned between system functions and processors. The
figure shows that the DSPs form the majority both in
number of processors and in throughput. In particular the
56001 provides 55% of the number of processors in the
system and 64% of the total system throughput.
Functionally, the majority of processors and throughput
are allocated to Beamforming and Detection. As seen in
Figure 1, 23% of the total number of processors and 30%
of the system throughput are allocated to Beamforming.
For Detection, 31% of the processors and 38% of the
throughput are allocated to this function.

[Figure 1. Functional and Processor Partitioning (pie charts,
including Processors/Function and Throughput/Function)]

While the throughput data presented above illustrates
the available computational capacity of the system and
gives some indication of the partitioning, it does not show
how much of the capacity is being utilized by the application
software. Table II contains resource utilization for the
A3 drawer in unit 1307^. Both processor and memory
utilization as well as available memory metrics are listed
for each individual processor or processor group. Because
the available memory and memory utilization in unit
1307 are generally relatively low, the data indicates that
the resident functions are more processing intensive than
memory intensive. With a few exceptions, the
processors are below the system TADSTAND
requirement of less than 80% processor and memory
utilization.
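
The exceptions to the 80% requirement can be picked out mechanically from Table II. The following Python sketch is an illustration, not part of the original analysis. The rows are transcribed from Table II, the last function label is an assumed reading of a garbled entry, and the names TABLE_II, LIMIT and over_limit are hypothetical.

```python
# Rows transcribed from Table II (drawer A3 of unit 1307):
# (function, module type, processor utilization %, memory utilization %).
TABLE_II = [
    ("CHANNEL BEAMFORMER", "BCM", 68.0, 37.0),
    ("TIME DELAY CALC.", "GCA", 5.0, 13.0),
    ("CHANNEL CONTROLLER", "GCA", 45.0, 21.0),
    ("DRAWER CONTROLLER", "GCM", 45.0, 20.0),
    ("FILTERING", "SPM", 79.0, 7.0),
    ("SPECIAL OUTPUT COLLECT", "SPM", 65.0, 10.0),
    ("SPECIAL OUTPUT", "EFM", 89.0, 10.0),
    ("FULLBEAM CORRELATOR", "SPM", 71.0, 78.0),
    ("HALFBEAM CORRELATOR", "SPM", 78.0, 78.0),
    ("DEMON", "SPM", 37.0, 94.0),
    ("ACOUS. FORMAT/OUTPUT", "CFM", 15.0, 10.0),  # label is an assumed reading
]

LIMIT = 80.0  # utilization must stay below this percentage


def over_limit(rows, limit=LIMIT):
    """Return (function, resource, value) for each entry at or above limit."""
    flagged = []
    for function, _module, processor, memory in rows:
        if processor >= limit:
            flagged.append((function, "processor", processor))
        if memory >= limit:
            flagged.append((function, "memory", memory))
    return flagged


for function, resource, value in over_limit(TABLE_II):
    print(f"{function}: {resource} utilization {value:.1f}%")
```

With the Table II figures, only the Special Output processor and the DEMON memory are flagged.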

2.3: Implementational

The  implementational  design  view  of  the  WAA 
subsystem  is  represented  by  describing  each  hardware 
unit.  The  description  will  include  the  drawer  types, 
module  populations,  and  finally  overall  unit  metrics  for 
power,  weight,  volume  and  available  space^.  The 
importance  of  this  view  is  that  it  provides  a  detailed 
characterization  of  the  physical  aspects  of  the  baseline 
hardware. 

Three units comprise the WAA subsystem. All the
units are nine-drawer CEEEs populated by SEM-D format
modules. The units are water cooled and are powered by
155 VDC. Each unit contains one Power Control drawer
which provides EMI filtering, unit and drawer controls
and status monitoring, and distribution of 155 VDC
power. Units 1306 and 1307 each contain an Interface and
Control drawer which acts as a unit controller for
synchronization and timing control.

As described earlier, unit 1306 contains nine drawers.
In addition to the power drawer (A2) and the interface and
control drawer (A5), unit 1306 contains six
demultiplexor drawers (A1, A3, A4, A6, A7, A9) and one
unpopulated drawer (A8). Each drawer, except for A2,
contains 66 module slots. The module populations of the
demultiplexor drawers are identical. GCM modules, which
contain 68000 processors, perform control functions. The
SPM modules, which contain 2 56001 DSPs, perform the
majority of the demultiplexing. Unit 1306 contains 192
WPR modules which provide DC power to the OBE.

DRAWER                          # PROC  MODULE   PROC     MEM(RAM)  MEM(RAM)
                                        TYPE     UTIL(%)  UTIL(%)   AVAIL(KB)

A3      CHANNEL BEAMFORMER        18    BCM       68.0     37.0        32
        TIME DELAY CALC.           3    GCA        5.0     13.0        96
        CHANNEL CONTROLLER         3    GCA       45.0     21.0        96
        DRAWER CONTROLLER          1    GCM       45.0     20.0       128
        FILTERING                  6    SPM       79.0      7.0       192
        SPECIAL OUTPUT COLLECT     3    SPM       65.0     10.0       192
        SPECIAL OUTPUT             1    EFM       89.0     10.0       192
        FULLBEAM CORRELATOR        6    SPM       71.0     78.0       192
        HALFBEAM CORRELATOR        6    SPM       78.0     78.0       192
        DEMON            [illegible]    SPM       37.0     94.0       192
        ACOUS. FORMAT/OUTPUT       3    CFM       15.0     10.0       384

Table II. Unit 1307 Resource Utilization


         A1   A2   A3   A4   A5   A6   A7   A8   A9   TOTAL

BCM                18             18                     36
GCM       1         1    1    7    1    1                12
CFM       3         3    3         3    3                15
CTM       1         1    1    3    1    1                 8
DRM       3         6    3         6    3                21
DDM                           3                           3
SPM      18        12   18        12   18                78
PSM       1         1    1    2    1    1                 7
CDM       3         3    3    7    3    3                22
FPM                          28                          28
ROM                           6                           6
WFM       3              3              3                 9
WCM       9              9              9                27
GCA       3         3    3         3    3                15
DRD       1         1    1         1    1                 5
CIM                           5                           5
EFM       1         1    1         1    1                 5
DAC                 2              2                      4
KLM                 1              1                      2
TOTAL    47    0   53   47   61   53   47    0    0     308

Table III. Unit 1307 Module Population
Table IV. Summary of Unit Metrics


The module population of unit 1307 is listed in
Table III^. In addition to the power (A2) and interface and
control (A5) drawers, this unit contains three identical
Acquisition/Beamformer drawers (A1, A4, A7) and two
identical Track/Beamformer drawers (A3, A6). Drawers
A8 and A9 are empty. The primary computational
modules contained in the unit are the BCM (ASIC),
GCM (68000), CFM (56001), SPM (2/56001), FPM
(2/TMS320C30), GCA (68000/56001) and EFM
(56001). The WCM module performs 1 bit DIMUS
processing.

Unit 1308 contains two Data Management drawers
(A3, A4), a disk drawer (A6) and an EMSP I/O drawer
(A1) as well as the power control drawer (A2). Drawers
A5, A7, A8, A9 are empty. The modules in the
functional drawers (A1, A3, A4) are configured in
processing nodes to perform specific functions. Each node
contains 1 CPU, 1 or 2 GM16, 1 NSU and various
interface cards (FST, NTDSE, SCSI, CSIO).


Table IV summarizes several unit metrics for the
WAA subsystem. The percentage of spare slots for each
unit was computed from the ratio of occupied module
slots to available module slots.
All the units contain available space for additional
modules. The processing density for each unit was
computed by dividing the unit MIPS by the volume and
indicates that the majority of the system processing is
performed in unit 1307.
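
The processing density calculation can be sketched in a few lines of Python. This is an illustration only, using the MIPS and volume figures listed in Table V. The names UNITS and processing_density are hypothetical, and the MFLOPS of unit 1307 are deliberately left out because the text defines density in terms of MIPS.

```python
# Throughput (MIPS) and volume (cubic feet) per unit, from Table V.
UNITS = {
    "1306": {"mips": 486, "volume_ft3": 24},
    "1307": {"mips": 4729, "volume_ft3": 24},
    "1308": {"mips": 95, "volume_ft3": 24},
}


def processing_density(unit):
    """Return MIPS per cubic foot for one unit record."""
    return unit["mips"] / unit["volume_ft3"]


density = {name: processing_density(unit) for name, unit in UNITS.items()}
densest = max(density, key=density.get)
print(f"Highest processing density: unit {densest}")
```

Because all three units occupy the same volume, the ranking by density is the same as the ranking by MIPS.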

3:  System  Design  Factors 

One of the primary objectives of System Design
Factors is "to provide a mechanism to quantify and
identify a large, complex real-time system's strengths and
weaknesses so that effective comparison of different
systems is achievable"^. By defining a set of SDF,
specific system characteristics are more easily compared.
The information is presented as a hierarchy in which each
level provides increasingly specific descriptions of the
properties listed in the left hand column of the table.
Table V contains a list of SDF defined for the WAA
subsystem. The properties listed in Table V are only a
small subset of the SDF identified by Nguyen and
Howell^.

                                                      1306   1307   1308

I. Performance     Throughput     MFLOPS                 0   1400      0
                                  MIPS                 486   4729     95
                   Utilization    percent (%)           43     50     34
II. Physical       Power          kilowatts (kW)       5.9    3.6      2
                   Weight         pounds (lbs)        1120   1240   1060
                   Volume         cubic feet (ft^3)     24     24     24
III. Future Needs  Expandability  spare slots (%)       38     41     71

Table V. System Design Factors
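
One way to picture the hierarchy is as a nested mapping in which each level narrows the property until a measurable value per unit is reached. The Python sketch below is an assumption about representation, not the authors' tooling. The values are those of Table V, and the names SDF, leaves and compare are hypothetical.

```python
# Table V as a nested mapping: category -> factor -> metric -> {unit: value}.
SDF = {
    "Performance": {
        "Throughput": {
            "MFLOPS": {"1306": 0, "1307": 1400, "1308": 0},
            "MIPS": {"1306": 486, "1307": 4729, "1308": 95},
        },
        "Utilization": {"percent": {"1306": 43, "1307": 50, "1308": 34}},
    },
    "Physical": {
        "Power": {"kW": {"1306": 5.9, "1307": 3.6, "1308": 2}},
        "Weight": {"lbs": {"1306": 1120, "1307": 1240, "1308": 1060}},
        "Volume": {"ft3": {"1306": 24, "1307": 24, "1308": 24}},
    },
    "Future Needs": {
        "Expandability": {"spare slots (%)": {"1306": 38, "1307": 41, "1308": 71}},
    },
}


def leaves(tree, path=()):
    """Yield (path, per-unit values) for every leaf metric in the hierarchy."""
    for key, value in tree.items():
        if all(not isinstance(v, dict) for v in value.values()):
            yield path + (key,), value
        else:
            yield from leaves(value, path + (key,))


def compare(tree, unit_a, unit_b):
    """Return {metric path: (value for unit_a, value for unit_b)}."""
    return {" / ".join(p): (v[unit_a], v[unit_b]) for p, v in leaves(tree)}


for metric, pair in compare(SDF, "1306", "1307").items():
    print(f"{metric}: {pair[0]} vs {pair[1]}")
```

Walking the leaves gives a side-by-side comparison of any two units, which is the kind of comparison the hierarchy is meant to support.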

4:  Summary 

This paper provides a characterization of the WAA
subsystem. The information is presented using two
system definition approaches developed by researchers at
NSWC. The approaches are the System Design Views
and the System Design Factors methodologies.

The System Design Views approach uses 5 unique
design views to describe the system. They are the
behavioral, environmental, functional, informational and
implementational views. Each view provides specific
information which can be used separately to define
specific system characteristics. When taken together, the
set of views provides a comprehensive description of the
system.

The System Design Factors approach involves
creating a hierarchy of system properties. Each successive
layer of the hierarchy provides additional detail to define
the property. Presenting the information in a
hierarchical format allows the designer to compare and
contrast the relevant properties of various systems.

System definition methodologies such as System
Design Views and System Design Factors are being
developed to improve complex system engineering. The
goal of these methodologies is to improve the system
definition early in the design cycle. The hope is that these
methodologies will aid the system designer in the
conceptualization and development of complex systems.

When  considering  system  upgrades  or  improvements 
it  is  important  to  evaluate  characteristics  of  the  new 
system  against  the  existing  system.  This  report  has 
attempted  to  characterize  the  WAA  subsystem  by 
applying  currently  evolving  system  definition 
methodologies  to  present  the  pertinent  information.  The 
information  presented  will  be  used  to  perform  trade  off 
analysis  and  evaluate  relevant  aspects  of  the  proposed 
system  upgrade. 
Translating CMS-2 to Ada

Charles  H.  Sampson 
Computer  Sciences  Corporation 


Abstract 

This paper describes TRADA, a
CMS-2 to Ada translator that is being developed by Computer
Sciences Corporation under the auspices of the
Naval Command, Control and Ocean Surveillance Center
(NCCOSC) Research, Development, Test and Evaluation
Division (NRaD) in San Diego.

This paper begins with a discussion of the translation
philosophy behind TRADA and an overview of the translator.
Three sections that sketch CMS-2 follow.
Those familiar with CMS-2 might be tempted to skip
these sections, but they do contain some subtle points that
a user might be unaware of. Next come a discussion of CMS-2
features that cause translation problems and several of the
translation strategies used in TRADA. The paper
concludes with a brief discussion of the one known case
of inadequate translation and a summary.

1.  Philosophy  of  Translation 

What should be the goal of a CMS-2 to Ada translator?
There is no single answer to that question. From the
user's viewpoint a translator (from any language to any
other) should ideally translate all code with 100% accuracy;
if the ideal cannot be achieved then at least it should
translate the hard parts that the user doesn't want to do
himself.

In practice, it is often impossible to achieve either of
these goals, particularly when the language being translated
is as undisciplined as CMS-2. It is often precisely the
parts that the user finds hard that are very difficult or
impossible for a translator. Nonetheless, a translator can
have a significant impact in transitioning a project from
CMS-2 to Ada. It can take care of many of the mind-numbing
details, leaving the humans involved free to
devote their energies to important parts of the task.

The design and implementation of TRADA are driven
by three goals: to produce translated code that is transportable
and at least as easy to maintain as the original
CMS-2; to translate with 100% accuracy when translation
is possible; and to clearly indicate the problem when
translation is not possible. Even these goals are not always
attainable, but the cases when they are not attained
are expected to be very rare or, in the case of inaccurate
translation, detectable by some other means.

The desire for transportable, maintainable Ada code
comes from the belief that there is no particular virtue in
exercising an Ada compiler. That is, the primary, perhaps
unstated, purpose of the translation is to reap the
benefits of the Ada language, and two of those benefits are
transportability and maintainability. While there might be
some argument on the issue of transportability, particularly
for a specific system targeted to a particular Navy
computer, it is hard to justify ever producing code that is
harder to maintain than the CMS-2 original, as has been
seen from time to time.


Fully accurate translation is required because of the
carefully tuned algorithms in much deployed CMS-2 code,
particularly fixed-point calculations. If the translation
output inefficiently but accurately captures the algorithm,
the maintenance programmer can look at the Ada code
and decide how to improve the efficiency. This is a much
easier process than attempting to determine why a previously
correct algorithm is no longer producing correct
results.

Clear indication of translation problems is a counterpart
of the desire for fully accurate translation. When an
accurate translation is not possible, it is necessary to call
out to the maintenance programmer that intervention is
required.

2.  TRADA  Overview 

TRADA  is  viewed  as  a  one-shot  tool.  That  is,  any 
CMS-2  program  is  expected  to  be  translated  only  once. 
After  that,  any  modifications  will  be  made  to  the  output 
Ada  code.  Because  of  this  one-shot  aspect,  the  emphasis 
in  TRADA  has  been  on  the  translation  process,  with 
minimal  regard  for  execution  speed.  If  a  program  is  only 
going  to  be  used  once  per  project  and  only  a  few  times  in 
a  single  shop,  very  long  execution  times  can  be  accepted. 

It has been suggested that some might want to translate
using TRADA, investigate the translation output, modify
the CMS-2 source, translate again, and repeat this process
through several cycles. This approach seems to be based
on the belief that a slight tweak of the CMS-2 would
result in successfully translating many previously untranslatable
statements. While this might be true, in these
times of limited resources it was decided to carefully
document the kind of changes that should be made to the
code before translation and reduce costs by implementing
with minimum consideration of throughput.

The implementation of TRADA is in progress as this
paper is being written. Rather than freezing the description
of TRADA at this time, it is described as though it is
complete, with all specified features implemented.

TRADA is being developed on a VAX/VMS, using
DECAda. It has been designed to simplify transporting
to other hosts. In particular, its design does not take
advantage of the virtual memory of the VAX; all major
data structures have a spill-to-disk capability. (However,
TRADA is not a sequence of independently loaded phases.
Therefore a reasonable amount of memory will be
required to hold the executing code.)

In  the  following  subsections  a  number  of  terms  are 
used  that  might  be  unfamiliar  to  those  who  do  not  know 
CMS-2.  They  are  explained  in  section  3. 

2.1  Inputs 

TRADA is an unusual translator in that it only translates
entire programs. Certain of its translation decisions
require knowledge of all uses of the item being translated.
Therefore, translation of less than an entire program
would conflict with the goal of fully accurate translation.

As  a  result,  one  required  input  to  TRADA  is  the  entire 
source  of  the  program  to  be  translated.  TRADA  does 
support  the  include  capability  of  the  MTASS  CMS-2 
compilation  system,  so  it  is  not  necessary  to  gather  all  of 
the  source  into  a  single  file.  On  the  other  hand,  TRADA 
does  not  support  compools,  so  the  source  of  the  system 
data  blocks  that  make  up  compools  must  be  presented  as 
part  of  the  source  to  be  translated.  (Compools  cannot  be 
used  in  TRADA  because  of  the  need  to  see  all  uses  of 
items  before  their  translation  can  be  determined.  Even  if 
this  were  not  an  issue,  compools  are  not  cost-effective 
because  they  would  be  used  only  once.  Thus  the  cost  of 
implementing  compools  in  TRADA  would  not  be  paid 
back  as  it  is  for  a  CMS-2  compiler  by  the  reduction  in 
resource  usage  gained  when  they  are  input  to  multiple 
compilations.) 

The presented source program must also be correct
CMS-2; if any syntax or semantic error is found, no
translation is produced. The potential for chaos that could
result from translating an incorrect CMS-2 program is too
great. The requirement of a complete, correct source
program means that the user must fix the problems discussed
in sections 5.2 and 5.3. Without these fixes,
proper translation is impossible.

TRADA must also be told the name of the main procedure.
This information is used solely to generate the Ada
main procedure, which does nothing but call the translation
of the CMS-2 main procedure.
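
A generated main procedure of this kind is very small. The Python sketch below shows one plausible shape for it, purely as an illustration. The layout of the Ada text and the names ada_main, Startup_Module and Startup are hypothetical, and the sketch does not describe TRADA's actual output.

```python
def ada_main(package_name, procedure_name, main_name="Main"):
    """Return Ada source for a main procedure that only calls
    package_name.procedure_name (all names are supplied by the caller)."""
    return (
        f"with {package_name};\n"
        f"procedure {main_name} is\n"
        f"begin\n"
        f"   {package_name}.{procedure_name};\n"
        f"end {main_name};\n"
    )


# Hypothetical names for the package and procedure that hold the
# translation of the CMS-2 main procedure.
print(ada_main("Startup_Module", "Startup"))
```

The only information such a generator needs is where the translated main procedure lives, which is why the user is asked for its name.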

TRADA optionally accepts the input of a script file,
which directs certain aspects of the translation. Examples
of the kind of thing controllable through the script file are
whether the use clause or fully qualified names should be
used, whether fixed-point conversions should be by truncation
or left to the Ada fixed-point accuracy specifications,
and whether between-statement comments are attached
to the preceding construct or the following one.

2.2  Outputs 

TRADA creates one package corresponding to each
CMS-2 module of the program being translated. For
system data blocks, the package usually consists of a
specification only, although there are cases in which the
package will have a body. For system procedure blocks
the package specification contains the declarations of the
module's entry subprograms and any other exported declarations.
Each of these compilation units is created in a
separate file.

The output also contains a package (specification only)
corresponding to the major header and a "main" procedure
that calls the translation of the CMS-2 program's
startup procedure.

In addition, a number of packages are created that do
not correspond to any single construct of the program
being translated. One of these contains the Ada translations
of all of the CMS-2 types used, along with a number
of functions needed to support those types (mostly conversion
functions that truncate). This package is tailored to
the specific translation; there are too many valid CMS-2
fixed-point types (32,130) to use a canned package that
contains translations of all of them. To minimize the
number of explicit conversions needed in translated expressions,
this package makes heavy use of subtypes. (As
an aside, the ranges and deltas used to define these types
are expressed in octal, which corresponds clearly to the
way the types are expressed in CMS-2.)
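
To make the octal remark concrete, the Python sketch below writes a binary delta as an Ada based literal in base 8. It is only an illustration of the notation: the form of the declarations TRADA actually generates is not shown in this paper, the function names octal_literal and delta_literal are hypothetical, and a non-negative number of fractional bits is assumed.

```python
from fractions import Fraction


def octal_literal(value):
    """Format a non-negative value whose denominator is a power of two
    as an Ada based literal in base 8, for example 8#0.02#."""
    value = Fraction(value)
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    denominator = value.denominator
    if denominator & (denominator - 1):
        raise ValueError("denominator must be a power of two")
    whole, frac = divmod(value, 1)
    digits = []
    while frac:
        # Each multiplication by 8 shifts one octal digit left of the point.
        frac *= 8
        digit, frac = divmod(frac, 1)
        digits.append(str(int(digit)))
    return f"8#{int(whole):o}.{''.join(digits) or '0'}#"


def delta_literal(fraction_bits):
    """Return 2**-fraction_bits as a base-8 literal (fraction_bits >= 0)."""
    return octal_literal(Fraction(1, 2 ** fraction_bits))


for bits in (1, 3, 5):
    print(bits, delta_literal(bits))
```

A delta of five fractional bits, for instance, is 1/32, which is two sixty-fourths and therefore 8#0.02#.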
The specification of another generated package contains
the translations of everything whose definition was incomplete.
If any of those items are subprograms, the body of
this package is generated to contain their stubbed definitions.
The existence of this package is a warning that the
requirement of a complete program has not been met. It
does, however, permit the translated output to be compiled
and linked. The careful user of TRADA might be
able to flesh out this package and get working Ada code,
rather than fixing the CMS-2 and retranslating.

Other packages, generated as needed, contain
Ada versions of subprograms predefined in CMS-2 and
code to simulate the hardware switches of the Navy computers.
The latter will ultimately be replaced, of course,
but it is useful in the initial post-translation stages when
the translated code is being verified, augmented, and
cleaned up.

2.3 Assumptions

TRADA assumes very little about the Ada compiler to
be used on the translated code. It assumes that the compiler
supports a 32-bit integer, as suggested by the Uniformity
Rapporteur Group, but does not assume that it is
Standard.Integer. It implicitly assumes that the compiler
has enough capacity to compile the translated code. It
assumes that binary fixed-point deltas are supported for
some reasonable set of values. (I.e., it will properly
compile statements of the form

    FOR Xyz'Small USE 2.0 ** N;

for a reasonable range of values of N.)

3.  Description  of  CMS-2 

CMS-2 is a closely related family of languages. By
standards of the 1990's, these languages lie at the low end
of the spectrum of high order languages. Indeed, when
I first became familiar with CMS-2 almost 20 years ago,
I characterized them as "disguised assembly languages".
I have had no reason to change that characterization since.
Of the well-known languages, the CMS-2 dialects are
probably closest to C in strengths and weaknesses, although
certainly not in syntax.

In  this  paper  I  use  the  term  CMS-2  as  though  it  were 
a single language rather than a family. When I write
about  a  particular  dialect,  I  make  that  clear. 

CMS-2 uses a bit of non-standard terminology, both in
its keywords and in the description of features. To minimize
the jolt to the sensibilities of the uninitiated, I will
describe CMS-2 using Ada terminology as much as possible.
For the benefit of those who know CMS-2, and who
might be confused by what they consider non-standard
terminology, I will put the corresponding CMS-2 term in
parentheses.

One  Ada  term  that  will  be  used  in  the  following  has 
not  yet  received  wide  acceptance.  It  is  regional  data, 
meaning  data  declared  in  a  package  body  but  outside  any 
subunit  of  the  package. 

3.1  Gross  Program  Structure 

There are two structures for modularization in CMS-2:
the system data block and the system procedure block,
commonly called SYS-DD and SYS-PROC respectively,
after their defining keywords. (CMS-2 keywords often
contain embedded hyphens. Unlike COBOL, when the
hyphen is used as a minus sign it is not necessary to set
it off from identifiers in any fashion. This is only one of
the unusual problems imposed on compiler writers by the
language.) The system data block is a module that contains
declarations of global data. A system procedure
block contains executing code, along with data whose
scope is global, local to that module, or local to a subprogram
within that module. Each of these modules can be
preceded by a minor header, which essentially parameterizes
the module. The purpose of a system procedure
block appears to have been to afford a mechanism for
encapsulating a subprogram of the system along with any
auxiliary subprograms needed to assist it in carrying out
its function. In practice, it is often used as a simple
collection of more or less related subprograms.

A  CMS-2  program  consists  of  a  number  of  these  modules.
There  is  no  CMS-2  construct  for  indicating  the
"main  program".  The  means  for  transferring  control
from  the  operating  system  to  the  program  depends  on  that 
operating  system,  which  is  often  special-purpose. 

A  compilation  (compile-time  system)  consists  of  the 
source  code  for  one  or  more  modules,  preceded  by  a 
major  header,  which  conceptually  is  used  to  parameterize 
the  entire  program.  A  compilation  can  also  make  use  of 
compools,  which  are  previously  compiled  system  data 
blocks.  As  far  as  the  semantics  of  a  compilation  are 
concerned,  the  use  of  a  compool  has  the  same  effect  as 
including  the  source  of  those  system  data  blocks.  What 
is  saved  by  using  compools  is  the  cost  of  repeatedly
parsing  and  semantically  checking  the  data  blocks.  CMS-2
compilers  also  typically  support  some  form  of  source  file 
inclusion.

Two  distinct  forms  for  source  code  documentation  are
afforded.  A  comment  is  free-form  non-executable  text 
that  can  only  appear  between  statements  and  declarations. 
A  note  can  appear  between  any  two  tokens  of  the  pro¬ 
gram. 

3.2  Declarations 

The  basic  data  types  of  CMS-2  comprise  integer,  fixed- 
point,  floating-point,  boolean,  fixed  length  string  (charac¬ 
ter  or  Hollerith  type),  and  enumeration  (status  types). 
The  integer  and  fixed-point  types  are  specified  in  terms  of 
the  number  of  bits  required  to  hold  their  values  (size);  for 
fixed-point  the  fractional  part  (scaling)  is  also  specified  in 
number  of  bits.  The  floating-point  types  are  specified  in 
terms  of  the  floating-point  formats  available  on  the  target 
machine.  Enumeration  literals  are  delimited  by  apostro¬ 
phes.  The  actual  enumeration  literal  values  are  thus  not 
restricted  to  be  identifiers;  any  character  from  the  CMS-2 
character  set  can  be  used. 

CMS-2  contains  constructs  for  declaring  typeless  nu¬ 
meric  constants  (ntags)  and  typed  variables,  of  either  the 
basic  data  types  (simple  variables)  or  composite 
types — records  (item-areas  or  structured  variables)  and 
arrays  (tables).  However,  the  components  (fields)  of  a 
record  type  are  restricted  to  be  of  the  basic  types.  It  is 
possible  to  specify  an  initial  value  for  most  objects  (vari¬ 
ables  and  tables).  The  declaration  of  an  array  object 
specifies  how  it  is  to  be  laid  out  in  memory  (similar  to 
row  major  vs.  column  major).  This  is  an  execution  efficiency
issue  and  does  not  affect  the  program  semantics,
although  use  of  one  of  the  forms  does  impose  some  re¬ 
strictions. 

Only  very  late  in  the  evolution  of  CMS-2  did  it  obtain 
an  explicit  type  declaration.  A  number  of  unusual  syntaxes
and  concepts  were  used  to  compensate  for  this  lack.
Specifically,  the  declaration  of  an  array  uses  a  block
syntax.  Within  that  block,  the  structure  of  the  array’s
components  (items)  is  declared.  Within  that  block  it  is
also  possible  to  declare  other  arrays  whose  components
have  the  same  structure  (like-tables),  variables  having  the
same  structure  (item-areas),  and  sub-arrays  (subtables),
whose  components  are  a  contiguous  subset  of  components 
of  the  array  being  declared.  Even  with  these  mecha¬ 
nisms,  restrictions  in  the  language  made  it  necessary  to 
replicate  component  declarations  in  different  array  decla¬ 
rations.  This  occurs  quite  often  in  old  CMS-2  code,  in 
spite  of  the  obvious  maintenance  problems  it  engenders. 

A  switch  is  a  CMS-2  construct  not  found  in  many  other 
languages.  A  label  switch  specifies  a  number  of  labels;
control  is  transferred  to  one  of  those  labels  when  a  particular
form  of  goto  statement  is  executed.  A  procedure
switch  specifies  a  number  of  procedures;  one  of  those
procedures  is  executed  when  the  switch  is  invoked,  with
a  syntax  very  similar  to  that  of  a  procedure  invocation.
In  both  cases  the  choice  depends  either  on  the  value  of  an
integer  expression  (indexed  switch)  or  the  value  of  a
specified  variable  (item  switch)  at  the  moment  of  execution
or  invocation.  Thus,  executing  a  goto  using  an  indexed
label  switch  has  an  effect  quite  similar  to  FORTRAN's
computed  goto. 
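
The behaviour of an indexed procedure switch can be sketched as a dispatch table. This is only an illustrative Python model, not CMS-2 syntax: the procedure names, the zero-based index, and the `on_invalid` handler (standing in for a label reached when the selecting value is invalid) are all assumptions.

```python
# Hypothetical model of a CMS-2 indexed procedure switch as a dispatch table.
# All names are illustrative; none come from a real CMS-2 program.

def track_update():
    return "track update"

def display_refresh():
    return "display refresh"

def invalid_index():
    # Stands in for the code reached when the selecting value is invalid.
    return "invalid"

PROCEDURE_SWITCH = [track_update, display_refresh]  # list position = switch index

def invoke_switch(switch, index, on_invalid):
    """Call the procedure selected by index, or on_invalid if index is out of range."""
    if 0 <= index < len(switch):
        return switch[index]()
    return on_invalid()

print(invoke_switch(PROCEDURE_SWITCH, 1, invalid_index))  # display refresh
print(invoke_switch(PROCEDURE_SWITCH, 7, invalid_index))  # invalid
```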

It  is  possible  to  elevate  the  scope  of  many  identifiers. 
Identifiers  declared  in  a  system  procedure  block  whose 
scope  would  ordinarily  be  local  to  that  block  can  often  be 
given  global  scope. 

CMS-2  does  not  have  a  project  library  similar  to  that 
of  Ada.  In  order  to  support  building  large  systems  out  of 
a  number  of  small  compilations,  most  of  the  declarations 
can  be  modified  (EXTREFed)  to  indicate  that  the  datum
is  not  to  be  allocated  as  a  result  of  the  declaration;  only
its  attributes  are  being  specified  in  order  to  give  the  compiler
the  information  it  needs  to  generate  the  proper  code
sequences.  The  code  is  tied  to  the  proper  memory  location
(allocated  as  a  consequence  of  an  unmodified  declaration
elsewhere)  when  the  system  is  linked.

3.3  Subprograms 

Subprograms  consist  of  procedures  and  functions. 
Procedures  can  optionally  have  input  and  output  parame¬ 
ters.  Functions  can  have  only  input  parameters  and  can 
return  values  of  only  the  basic  types.  It  is  not  possible  to 
nest  subprograms.  Subprograms  are  not  recursive,  but 
they  are  compiled  with  code  that  permits  reentrancy,  if 
the  requisite  executive  support  is  available. 

In  some  of  the  dialects  of  CMS-2,  procedures  can  also 
have  exit  parameters,  similar  to  the  abnormal  exit  of 
FORTRAN.  At  the  point  of  invocation,  the  exit  parameters
are  matched  with  labels.  If  the  invocation  is  terminated
by  executing  an  "abnormal"  return,  control  is  transferred
to  the  corresponding  label.  In  this  case,  the  assignment
of  values  to  actual  output  parameters  is  not  performed.
(See  section  4.3.)

3.4  Expressions

The  operands  of  numeric  expressions  can  be  of  any 
numeric  type.  The  rules  of  the  language  specify  the 
implicit  type  conversions  that  will  be  performed  in  order 
to  evaluate  a  numeric  expression.  In  particular,  an  expression
involving  fixed-point  values  is  evaluated  according
to  an  elaborate  set  of  fixed-point  scaling  rules,  which
are  highly  machine  dependent.  Generally,  these  rules 
prevent  loss  of  significant  data,  specifying  when  values 
are  to  be  shifted  right  internally  to  avoid  overflow. 
(There  are  a  few  exceptions  that  are  intentionally  in  the
language!)  When  type  conversion  results  in  the  loss  of 
information,  it  is  by  truncation  rather  than  rounding. 
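
The difference between truncation and rounding can be illustrated with a small fixed-point model. This is a sketch under stated assumptions: the helper names are hypothetical, the scaling is expressed as a count of fraction bits, and truncation is taken to be toward zero, which the text does not specify.

```python
from fractions import Fraction

def to_fixed(value, fraction_bits):
    """Scale value by 2**fraction_bits and truncate toward zero (assumed direction)."""
    scaled = Fraction(value) * (1 << fraction_bits)
    return int(scaled)  # int() on a Fraction truncates toward zero

def from_fixed(raw, fraction_bits):
    """Recover the exact value represented by a raw fixed-point integer."""
    return Fraction(raw, 1 << fraction_bits)

raw = to_fixed(Fraction(7, 10), 3)   # 0.7 * 8 = 5.6, truncated to 5
print(raw, from_fixed(raw, 3))       # 5 5/8
```

Rounding would have stored 6 (that is, 0.75); truncation stores 5 (0.625), so the error is always in one direction.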

The  usual  numeric  operations  are  available  for  data  up 
to  32  bits  long  (counting  the  sign  bit).  These  operations 
can  be  written  for  quantities  in  the  33-64  bit  range.  Addi¬ 
tion  and  subtraction  of  such  quantities  are  performed 
efficiently  and  accurately.  When  such  quantities  are 
multiplied  or  divided,  the  operation  is  performed  in  float¬ 
ing-point.  (The  conversion  of  a  64-bit  value  to  floating¬ 
point  is  inexact  in  all  dialects  of  CMS-2.) 

Boolean  expressions  are  usually  evaluated  by  short- 
circuit.  This  is  always  the  case  in  conditional  statements 
and  is  now  the  case  for  some  compilers  in  boolean  assign¬ 
ment  statements.  This  evaluation  is  an  artifact  of  the 
compiler  implementations;  it  is  not  specified  as  part  of  the
language  definition. 

3.5  Statements 

Although  their  syntaxes  can  be  quite  non-standard, 
CMS-2  has  the  expected  collection  of  statements:  assign¬ 
ment,  if-elsif-else,  loops,  case,  goto,  etc.  Many  of  these 
have  unusual  features.  There  are  also  a  few  non-standard 
statements. 

Loops  (vary  statements)  have  the  usual  three  forms: 
FOR-loop  (controlled  by  an  index),  WHILE-loop  (con¬ 
trolled  by  a  boolean  expression  evaluated  at  the  top  of  the 
loop),  and  loop-UNTIL  (controlled  by  a  boolean  expres¬ 
sion  evaluated  at  the  bottom  of  the  loop).  A  loop  index 
can  be  of  either  a  numeric  or  an  enumeration  type. 

One  unusual  feature  of  loops  is  that  numeric  loop 
indexes  are  not  required  to  be  integer;  a  numeric  loop 
index  is  initialized,  incremented  on  each  iteration,  and 
tested  for  equaling  or  exceeding  its  terminal  value,  re¬ 
gardless  of  its  type.  Another  is  that  any  combination  of 
the  loop  controls  can  be  used  in  a  single  loop,  including 
multiple  indexes;  the  loop  terminates  as  soon  as  any  one 
of  its  controls  satisfies  its  termination  condition. 
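
A loop that combines an index control with a WHILE control might be modelled as follows. This is a rough sketch: the generator name `vary` is borrowed from the statement name only for readability, and it assumes both controls are tested before the body runs and that the terminal value itself is not executed, neither of which is settled by the description above.

```python
from fractions import Fraction

def vary(start, step, limit, while_condition):
    """Yield index values until either control reaches its termination condition."""
    index = start
    # Index control: stop on equaling or exceeding the terminal value.
    # WHILE control: stop as soon as the condition is false.
    while index < limit and while_condition(index):
        yield index
        index += step

# A non-integer index stepping by 1/4, combined with a WHILE condition.
values = list(vary(Fraction(0), Fraction(1, 4), Fraction(1),
                   lambda i: i * i < Fraction(1, 4)))
print(values)  # [Fraction(0, 1), Fraction(1, 4)]
```

Here the WHILE control ends the loop at index 1/2, well before the index control would have ended it at 1.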

Related  to  loops  is  the  resume  statement.  When  it  is 
executed  inside  a  loop,  it  terminates  execution  of  the
current  iteration.  (It  causes  bottom-of-loop  processing  to
occur.)  It  can  also  be  executed  outside  of  the  loop  it
specifies.  In  this  case,  the  loop  resumes  execution  at  its
bottom-of-loop  processing. 

A  special  form  of  loop  (find  statement)  is  supplied  for 
performing  a  linear  search  of  any  array.  The  search 
terminates  when  a  component  satisfying  specified  criteria
is  found.  The  search  can  be  continued  by  the  resume 
statement. 

The  selector  of  a  case  statement  can  be  of  any  basic 
type.  The  semantics  specify  that  a  search  is  made  for  a 
matching  case  value,  clearly  dangerous  if  the  selector’s 
type  is  an  approximate  type,  such  as  fixed-point  or  float¬ 
ing-point. 
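
The danger of matching on an approximate type is easy to demonstrate. The sketch below uses Python floats (IEEE 754 double precision, not a CMS-2 target format), and the function `select_case` is a hypothetical stand-in for the search the semantics describe.

```python
def select_case(selector, cases, default=None):
    """Search the case values in order and return the result of the first exact match."""
    for case_value, result in cases:
        if selector == case_value:
            return result
    return default

cases = [(0.1, "one tenth"), (0.3, "three tenths")]

print(select_case(0.3, cases))        # three tenths
print(select_case(0.1 + 0.2, cases))  # None: the computed sum is not exactly 0.3
```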

There  is  no  restriction  on  the  relative  locations  of  a 
goto  statement  and  its  label.  Free  transfer  of  control  into 
and  out  of  arbitrary  statement  blocks  is  permitted.  One 
dialect  even  permits  transfer  of  control  across  subprogram 
boundaries. 

As  mentioned  in  3.2,  there  are  forms  of  the  goto  state¬ 
ment  for  executing  a  multi-way  branch  through  a  label 
switch  and  there  are  procedure-call-like  statements  for 
invoking  procedure  switches.  These  statements  also  allow 
specification  of  a  label  to  be  transferred  to  if  the  value  on 
which  the  switching  decision  is  to  be  made  is  invalid 
(invalid  specification). 

4.  Unusual  and  Problematic  Features 

The  above  description  of  CMS-2  concentrated  on  con¬ 
cepts  that  it  shares,  more  or  less,  with  other  high-level 
languages.  It  has  however  a  number  of  features  that  are 
rare,  if  not  unique.  In  preparing  this  paper  I  easily  devel¬ 
oped  a  list  of  48  of  them.  In  the  following,  I  discuss  only 
those  that  give  particular  problems  in  translating,  omitting 
those  that  might  cause  headaches  for  compiler  and  transla¬ 
tor  writers  but  whose  actual  translation  is  simple. 

4.1  Static  Aliasing 

CMS-2  affords  an  astonishing  variety  of  ways  to  stati¬ 
cally  alias  objects.  (By  this  I  mean  causing  two  objects  to 
share  memory.  I  am  not  discussing  dynamic  aliasing, 
such  as  the  aliasing  that  occurs  in  many  languages  when 
a  global  object  is  used  as  an  actual  parameter  of  a  subpro¬ 
gram  invocation,  for  example.) 

First  among  these  is  the  overlay  statement,  which 
specifies  that  the  bit  string  allocated  to  an  object  is  to  also 
be  used  for  the  bit  strings  allocated  to  some  other  objects. 
Notice  that  this  is  much  finer  than  simply  specifying  that
two  objects  are  to  use  the  same  memory  location.  As  an
example,  for  an  18-bit  object  it  is  possible  to  specify  that 
bits  1-6  are  to  be  used  for  another  object  and  bits  9-17  are 
to  be  used  for  a  third.  This  overlaying  can  be  specified 
for  directly  allocatable  objects  (those  that  hold  the  values 
of  variables)  or  for  the  components  of  a  record  type. 
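
The 18-bit example can be modelled as a parent value with two fields that are views onto its bits. This is a minimal sketch, assuming bit 0 is the least significant bit (the numbering convention is not stated here); the class and field names are hypothetical.

```python
class Overlay18:
    """An 18-bit parent object with two fields overlaid on bits 1-6 and 9-17."""
    WIDTH = 18

    def __init__(self, value=0):
        self.value = value & ((1 << self.WIDTH) - 1)

    def _get(self, low, high):
        width = high - low + 1
        return (self.value >> low) & ((1 << width) - 1)

    def _set(self, low, high, field):
        width = high - low + 1
        mask = ((1 << width) - 1) << low
        # Replace only the overlaid bits; the rest of the parent is untouched.
        self.value = (self.value & ~mask) | ((field << low) & mask)

    @property
    def second(self):
        return self._get(1, 6)

    @second.setter
    def second(self, field):
        self._set(1, 6, field)

    @property
    def third(self):
        return self._get(9, 17)

    @third.setter
    def third(self, field):
        self._set(9, 17, field)

parent = Overlay18()
parent.second = 0b111111
parent.third = 1
print(parent.value)  # 638: writing either field changes the parent object
```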

This  effect  can  be  obtained  a  second  way  for  the  components
of  a  record  type,  because  CMS-2  allows  user-packing,
similar  to  Ada’s  record  representation  clause.
Unlike  the  Ada  construct,  user-packing  permits  overlapping
of  components.

Finally,  there  is  a  means  for  specifying  the  relative 
allocations  of  variable  objects.  This  relative  allocation  is 
expressed  in  words  of  the  target  machine;  the  programmer 
can  specify  that  XYZ  is  to  be  allocated  23  words  beyond 
the  place  where  ABC  is  allocated.  Notice  that  if  ABC  is 
an  array,  then  XYZ  might  be  allocated  in  the  middle  of 
one  of  ABC’s  components. 

4.2  Bit-Twiddling 

By  bit-twiddling  I  mean  working  with  bit  substrings  of 
values.  The  problem  with  bit-twiddling  is  that  the  lan¬ 
guage-defined  semantics  of  the  substrings  are  incomplete 
at  best.  While  the  language  specifies  which  bits  of  the 
value  are  to  be  extracted  or  replaced,  the  meaning  of 
those  bits  and  the  relation  of  them  to  the  larger  value  is 
known  only  to  the  programmer.  It  is  the  ease  with  which 
one  can  bit-twiddle  in  CMS-2  that  caused  me  to  character¬ 
ize  it  as  a  disguised  assembly  language. 

Overlaying,  either  through  the  overlay  statement  or 
implicitly  through  user-packing,  is  a  heavily  used  means 
for  bit-twiddling.  In  addition,  CMS-2  supplies  a  parame¬ 
terized  operator  or  pseudo-function  (the  BIT  operator  or 
function)  for  accessing  an  anonymous  bit  substring. 
There  is  another  parameterized  operator  or  pseudo-func¬ 
tion  (the  CHAR  operator  or  function)  for  accessing  char¬ 
acter  substrings  within  a  value;  the  value  is  not  required 
to  be  of  character  type. 

It  is  also  possible  to  access  individual  target-computer
words  of  an  object  (word  reference).  The  particular  word 
accessed  might  be  only  one  of  several  making  up  the 
object  or  it  might  contain  several  components  of  the  ob¬ 
ject.  Again,  the  exact  meaning  of  such  an  access  is 
known  only  by  the  programmer. 

Bit  strings  can  be  manipulated  by  the  conventional  set 
of  bit  operations:  AND,  OR,  NOT,  and  XOR.  Typed 
operands  of  these  operators  become  temporarily  typeless.
The  result  regains  a  type  dependent  on  the  context  of  the
operation. 

4.3  Subprogram  Parameter  Passage 

Subprogram  formal  parameters  are  very  unusual  in  that 
rather  than  serving  as  surrogates  for  the  actual  parameters
during  execution  of  an  invocation,  they  are  variables 
external  to  the  subprogram.  At  the  point  of  invocation, 
the  values  of  the  actual  input  parameters  are  copied  to  the 
corresponding  formal  input  parameter  prior  to  transfer  of 
control.  Following  the  invocation,  the  values  of  any 
formal  output  parameters  are  copied  to  the  corresponding 
actual  output  parameters. 

The  semantics  of  parameter  assignment,  coupled  with 
the  parameters  being  variables  external  to  the  subpro¬ 
gram,  can  give  rise  to  cross-parameter  interference:  as¬ 
signing  a  value  to  one  parameter  in  the  list  can  affect  the 
value  assigned  to  another.  An  example  is  a  formal  pa¬ 
rameter  that  is  an  integer  variable,  with  that  variable 
being  used  as  a  subscript  of  a  later  actual  parameter. 
Different  dialects  of  CMS-2  assign  the  formal  parameters 
in  different  orders. 
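
Cross-parameter interference can be sketched by treating formal parameters as entries in a shared environment. The model below is hypothetical: it assumes each actual parameter is evaluated immediately before it is copied to its formal, and the names `I`, `V`, and `table` are invented for the illustration.

```python
def bind_inputs(env, formals, actual_exprs, left_to_right=True):
    """Copy actuals into formals one at a time, evaluating each actual just before its copy."""
    pairs = list(zip(formals, actual_exprs))
    if not left_to_right:
        pairs.reverse()
    for name, expr in pairs:
        env[name] = expr(env)
    return env

table = [10, 20, 30, 40]

# Invocation modelled: P(3, table(I)), where I is also the first formal parameter.
actuals = [lambda env: 3, lambda env: table[env["I"]]]

forward = bind_inputs({"I": 0, "V": None}, ["I", "V"], actuals, left_to_right=True)
backward = bind_inputs({"I": 0, "V": None}, ["I", "V"], actuals, left_to_right=False)

print(forward["V"])   # 40: I was already 3 when the subscript was evaluated
print(backward["V"])  # 10: the subscript used the old value of I
```

The same invocation therefore passes different values depending on the order the dialect uses.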

There  is  no  restriction  on  the  use  of  formal  parameters; 
they  can  be  used  at  any  time  like  ordinary  variables. 
Furthermore,  the  same  variable  can  be  used  as  a  formal 
parameter  of  more  than  one  subprogram. 

4.4  Indirect  Arrays 

There  is  no  heap/pointer  mechanism  in  CMS-2,  but  an 
undisciplined  approximation  is  available  through  the  indirect
array  (indirect  table).  When  an  array  is  declared  to
be  indirect  it  functions  as  a  surrogate  for  other,  "real",
arrays  during  execution.  Thus  it  is  very  much  like  a 
pointer. 

Specifying  a  designated  object  is  accomplished  by 
using  the  CORAD  pseudo-function.  When  a  value  is 
assigned  to  this  pseudo-function  its  argument  can  only  be 
the  name  of  an  indirect  array.  That  value  then  becomes 
the  memory  address  of  the  designated  object  for  subse¬ 
quent  references  to  the  indirect  array. 

When  the  CORAD  pseudo-function  is  used  in  other 
contexts,  its  argument  must  be  a  name  that  has  an  as¬ 
signed  memory  address;  it  generates  that  address  as  its 
value.  Thus  the  statement 

SET  CORAD (XYZ)  TO  CORAD (ABC)  $

means  that  for  subsequent  references,  XYZ  is  to  be  used
as  a  surrogate  for  ABC. 

There  is  no  requirement  that  the  structure  of  the  desig¬ 
nated  object  be  at  all  related  to  the  structure  specified  in 
the  indirect  array’s  declaration.  It  is  not  even  necessary 
that  the  designated  object  be  an  array.  It  could  be  a 
simple  variable,  the  "tail"  of  an  array  (by  specifying  the 
address  of  one  of  that  array’s  components),  a  switch,  or 
even  code!  In  all  these  cases  of  mismatch  the  indirect
array’s  structure  rules;  the  designated  object  is  accessed
as  though  its  structure  were  that  of  the  indirect  array. 
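
The surrogate behaviour can be modelled with a flat word memory and an array view that carries only a base address. This is an illustrative sketch: the memory size, the addresses, and the `IndirectArray` class are assumptions, and assigning to `base` stands in for assigning to the CORAD pseudo-function.

```python
MEMORY = [0] * 16  # hypothetical word-addressed memory

class IndirectArray:
    """A view whose structure comes from its own declaration, not from what it designates."""

    def __init__(self, memory, length):
        self.memory = memory
        self.length = length  # declared length of the indirect array
        self.base = None      # address of the designated object

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        return self.memory[self.base + index]

    def __setitem__(self, index, value):
        if not 0 <= index < self.length:
            raise IndexError(index)
        self.memory[self.base + index] = value

ABC_ADDRESS = 4  # where the "real" array ABC (4 words) is assumed to live
MEMORY[ABC_ADDRESS:ABC_ADDRESS + 4] = [11, 22, 33, 44]

xyz = IndirectArray(MEMORY, length=3)
xyz.base = ABC_ADDRESS                   # models SET CORAD(XYZ) TO CORAD(ABC)
first_view = [xyz[i] for i in range(3)]  # [11, 22, 33]

xyz.base = ABC_ADDRESS + 2               # designate the "tail" of ABC
tail_view = [xyz[i] for i in range(3)]   # [33, 44, 0]: runs past the end of ABC
xyz[0] = 99                              # writes through to ABC's third word
```

Nothing checks that the designated storage matches the declared structure, which is why the tail view silently reads a word beyond ABC.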

One  use  of  indirect  arrays  is  to  effect  call-by-reference 
semantics  in  subprogram  invocation.  Because  the  CMS-2 
parameter  passage  mechanism  is  call-by-value  (section 
4.3),  if  a  formal  parameter  is  an  array  then  the  entire 
array  must  be  copied  as  part  of  the  invocation.  It  is 
possible  to  specify  a  formal  parameter  as  the  address  of 
an  indirect  array  (using  the  CORAD  pseudo-function). 
By  doing  this,  only  the  address  of  the  actual  parameter  is 
copied. 

Because  this  pointer  mechanism  is  specified  as  implemented
using  addresses,  an  executive  could  implement  a
heap.  Calls  to  its  allocator  would  simply  return  the  address
of  the  newly  allocated  memory  area.

4.5  Array  Initial  Value 

In  specifying  an  initial  value  of  an  array,  it  is  not 
necessary  that  the  array  be  fully  initialized.  The  array’s 
components  must  be  of  a  record  structure  and  the  initial 
value  is  specified  in  terms  of  values  to  be  assigned  to  the 
record’s  components  in  successive  array  components.  If 
fewer  values  are  specified  than  array  components,  only 
the  "first"  array  components  are  initialized;  the  values
assigned  to  the  remainder  are  not  defined  by  the  language. 
(However,  see  section  5.1.)  This  initialization  can  be 
ragged;  different  numbers  of  values  can  be  specified  for 
different  record  components. 
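
Ragged initialization might be pictured as follows. The sketch is hypothetical: the field names are invented, and the `UNDEFINED` sentinel simply marks values that the language leaves undefined.

```python
UNDEFINED = object()  # marks a value the language does not define

def initialize_table(size, **initial_values):
    """Build `size` records; each field gets its own, possibly shorter, list of initial values."""
    rows = [{name: UNDEFINED for name in initial_values} for _ in range(size)]
    for name, values in initial_values.items():
        if len(values) > size:
            raise ValueError(f"too many initial values for {name}")
        for index, value in enumerate(values):
            rows[index][name] = value
    return rows

# Three array components; two values given for one field, one for the other.
table = initialize_table(3, speed=[1, 2], heading=[90])
```

Only `table[0]` is fully initialized; `table[1]` has a defined `speed` but an undefined `heading`, and `table[2]` has neither.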

4.6  Multi-Receptacle  Assignment 

An  assignment  statement  can  contain  multiple  recepta¬ 
cles,  the  objects  that  receive  the  value.  If  the  receptacles 
are  of  different  numeric  types,  implicit  conversion  of  the 
value  being  assigned  is  required.  The  semantics  of  a 
multi-receptacle  assignment  specify  that  the  value  is  as¬ 
signed  to  the  first  receptacle,  converted  if  necessary. 
That  possibly  converted  value  is  then  assigned  to  the 
second  receptacle,  where  another  conversion  might  be 
needed.  This  value  is  then  used  for  the  third  receptacle,
and  so  on.  While  these  semantics  could  produce  some
surprising  results  for  the  unwary,  in  practice  the  overwhelming
use  of  this  feature  is  to  assign  zero.

Different  dialects  of  CMS-2  order  the  receptacle  list 
differently,  some  from  left  to  right  and  others  from  right 
to  left. 
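
The chained conversion, and its sensitivity to receptacle order, can be sketched as below. The receptacle names are hypothetical, and Python's `int` and `float` merely stand in for the implicit CMS-2 conversions.

```python
def multi_assign(value, receptacles, left_to_right=True):
    """Assign value through a chain: each receptacle receives the previous converted value."""
    order = list(receptacles) if left_to_right else list(reversed(receptacles))
    stored = {}
    for name, convert in order:
        value = convert(value)  # conversion for this receptacle's type
        stored[name] = value    # the converted value feeds the next receptacle
    return stored

receptacles = [("COUNT", int), ("RATIO", float)]

print(multi_assign(2.75, receptacles))                       # {'COUNT': 2, 'RATIO': 2.0}
print(multi_assign(2.75, receptacles, left_to_right=False))  # {'RATIO': 2.75, 'COUNT': 2}
```

Assigning zero gives the same result in either order, which is consistent with the feature being harmless in its most common use.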

4.7  Array  Assignment 

Array  assignment  is  accomplished  by  transferring 
target  machine  words,  without  regard  to  the  structures  of 
the  two  arrays.  If  the  receptacle  array  is  shorter  (in 
words)  it  is  filled  with  the  corresponding  words  from  the 
beginning  of  the  source  array.  If  the  receptacle  array  is
longer,  only  the  words  at  its  beginning  that  correspond  to
the  words  of  the  source  array  receive  those  new  values.
Thus,  if  the  arrays  have  identical  structures  the  right  thing 
happens,  and  if  their  components  have  identical  structures 
but  their  sizes  are  different  something  understandable 
happens.  However,  when  their  components  have  different 
structures,  an  insidious  form  of  bit-twiddling  occurs.
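
Word-for-word array assignment can be sketched with lists of machine words. The function name is hypothetical; each list element stands for one target-machine word.

```python
def assign_words(receptacle, source):
    """Copy as many leading words as both arrays have; leave the rest of the receptacle alone."""
    count = min(len(receptacle), len(source))
    receptacle[:count] = source[:count]
    return receptacle

print(assign_words([0, 0], [1, 2, 3]))     # [1, 2]: shorter receptacle takes the first words
print(assign_words([9, 9, 9, 9], [1, 2]))  # [1, 2, 9, 9]: trailing words keep old values
```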

4.8  String  Manipulation 

In  general,  string  assignment  and  string  comparisons 
are  accomplished  through  blank-padding  and  truncation. 
String  literals  are  handled  specially.  If  the  source  of  a 
string  assignment  is  a  literal,  it  is  not  padded  when  too 
short  (the  value  is  sliced  into  the  beginning  of  the  recepta¬ 
cle)  and  it  is  not  permitted  to  be  too  long. 
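
The two string assignment behaviours can be compared in a short model. This is a sketch only: the function name and the flag distinguishing literal sources are assumptions.

```python
def assign_string(receptacle, source, source_is_literal=False):
    """Return the new contents of a fixed-length receptacle after assignment."""
    size = len(receptacle)
    if source_is_literal:
        if len(source) > size:
            raise ValueError("string literal is longer than the receptacle")
        # A short literal is sliced into the beginning; the tail is left as it was.
        return source + receptacle[len(source):]
    # Otherwise: truncate if too long, blank-pad if too short.
    return source[:size].ljust(size)

print(repr(assign_string("ABCDEF", "XY")))        # 'XY    '
print(repr(assign_string("ABCDEF", "XY", True)))  # 'XYCDEF'
print(repr(assign_string("ABC", "WXYZ")))         # 'WXY'
```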

4.9  Typed  Records 

A  record  type  in  CMS-2  can  be  declared  to  have  an 
associated  basic  data  type.  When  a  variable  of  such  a 
type  is  referenced  as  a  whole,  it  has  the  semantics  of  a 
variable  of  the  associated  type.  As  a  benign  example,  a 
record  type  could  be  declared  to  have  components  corresponding
to  the  characteristic  and  mantissa  of  a  floating-point
value,  positioned  appropriately,  and  the  floating-point
type  as  its  associated  type.  For  a  variable  of  this
type,  the  components  could  be  used  to  create  a  floating-point
value,  which  could  thereafter  be  used  by  referencing
the  variable  itself. 

Ada  has  no  corresponding  capability. 

4.10  Boolean  Constants 

The  boolean  constants  true  and  false  are  not  defined  in
CMS-2;  their  functions  are  served  by  the  numeric  literals
1  and  0,  respectively.  In  practice,  named  number  declarations
are  often  used  to  define  the  identifiers  True  and
False,  as  well  as  other  convenient  identifiers  with  boolean 
overtones. 

Because  of  the  dual  use  of  0  and  1,  it  is  possible  to 
have  both  boolean  and  numeric  receptacles  in  a  single 
multi-receptacle  assignment  statement. 

4.11  Enumeration  Assignment  and  Comparison 

All  enumeration  types  are  considered  compatible. 
Assignment  and  comparison  of  enumeration  values  is 
based  solely  on  the  compiler-generated  encoding  of  the 
symbolic  values  (0  for  the  first  value,  1  for  the  second, 
etc.) 

4.12  Data  Local  to  Subprograms 

CMS-2  is  not  a  stack-oriented  language;  data  local  to 
subprograms  are  statically  allocated.  Thus  the  values  of 
these  data  at  the  end  of  execution  of  one  invocation  are 
present  at  the  beginning  of  execution  of  the  next  invoca¬ 
tion.  Any  initial  values  given  to  such  data  are  only  the 
values  at  the  beginning  of  execution  of  the  program. 
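
Static allocation of local data can be imitated in Python by keeping the "local" in module-level storage. The names are hypothetical; the point is only that the initial value is applied once and the value survives between invocations.

```python
# Storage for the subprogram's "local" datum, allocated and initialized once.
_local_data = {"calls": 0}

def count_calls():
    """Each invocation sees the value left behind by the previous one."""
    _local_data["calls"] += 1
    return _local_data["calls"]
```

A translation into a stack-oriented language has to preserve this persistence explicitly, since an ordinary local variable there would be re-created on every call.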

4.13  Conditional  Compilation 

Conditional  compilation  is  achieved  through  a  blocking 
construct,  where  the  block’s  header  specifies  a  flag 
(CSWITCH)  whose  value  controls  the  compilation  of  the 
code  in  the  block.  The  blocks  can  appear  among  both 
statements  and  declarations. 

If  the  flag  setting  is  such  that  a  block  is  not  to  be 
compiled,  the  code  in  that  block  is  ignored,  other  than 
some  minimal  checking  to  detect  the  end  of  the  block. 
Minimal  checking  has  a  number  of  implications,  principal 
among  them  being  that  the  code  being  checked  does  not 
have  to  be  syntactically  or  semantically  correct.  It  also 
means  that  an  identifier  can  be  declared  in  any  number  of 
blocks,  provided  that  only  one  of  those  blocks  is  com¬ 
piled. 

4.14  Direct  Code 

Assembly  language  statements  can  be  included  in  a 
CMS-2  program  by  enclosing  them  in  a  special  block 
(direct  code  block).  The  programmer  is  given  almost  the 
full  freedom  (license)  available  to  an  assembly  language 
programmer:  executing  instructions  can  be  placed  among 
high-level  data  declarations,  assembly  data  declarations 
can  be  placed  among  high-level  statements,  and  the  instructions
and  declarations  can  be  intermixed  in  any  order.

5.  Usage  of  CMS-2 

In  addition  to  the  official  semantics  of  CMS-2, 
sketched  above,  there  are  facts  about  the  language  that 
might  be  called  "informal  semantics".  These  facts  are
known  and  relied  on  by  many  users  of  the  language,  even 
though  they  have  no  official  sanction. 

5.1  Initialization  to  Zero 

All  of  the  CMS-2  linkers  set  all  data  values  to  zero 
unless  they  are  explicitly  initialized  to  some  other  value. 
Some  CMS-2  programs  make  use  of  this  to  avoid  expend¬ 
ing  memory,  which  is  often  limited,  on  code  that  zeros 
some  uninitialized  data. 

5.2  Linking  by  Name 

CMS-2  linkers  link  by  name;  no  semantic  information 
is  passed  from  the  compiler  to  the  linker.  Occasionally 
this  is  "exploited"  in  creating  a  program  from  independent
compilations,  when  a  name  is  given  different  attributes  in 
two  or  more  of  those  compilations. 

5.3  Reuse  of  Named  Numbers 

The  names  of  named  numbers  are  not  passed  to  the 
linkers  by  the  compilers.  Therefore  it  is  possible  (and  it
does  occasionally  occur)  for  "the  same  named  number"
to  have  different  values  in  two  independent  compilations 
of  a  program. 

6.  Intractable  Translation  Problems 

Certain  of  the  features  of  CMS-2  give  rise  to  transla¬ 
tion  problems  that  cannot  be  solved  by  any  translator,  at 
least  not  when  the  goals  of  the  translation  include  main¬ 
tainability  and  transportability,  as  they  do  for  TRADA. 

6.1  Bit-twiddling 

Any  attempt  to  mechanically  translate  the  various  bit- 
twiddling  features  of  CMS-2  is  doomed  to  fail  for  one  of 
several  reasons.  The  common  thread  to  these  reasons  is 
that  the  exact  meaning  of  the  bit-twiddling  cannot  be 
discerned  from  the  CMS-2  source,  as  noted  previously. 
As  an  example,  suppose  that  a  bit  string  is  being  extracted 
from  some  datum  (typically  a  machine  word)  to  be  used 
in  a  numeric  context  and  that  the  numeric  value  being 
extracted  might  possibly  be  negative.  If  the  extraction  of
the  bit  string  is  "correctly"  translated,  but  the  negative
representation  of  the  target  computer  changes  (the
AN/UYK-7  family  of  Navy  computers  uses  1's-complement,
while  2's-complement  is  currently  the  most  popular
representation),  the  meaning  of  the  program  has  been
changed  by  the  translation.  As  a  second  example,  suppose
that  bit-twiddling  is  being  used  in  a  utility  routine  to
create  a  floating-point  value  by  building  up  its  component 
parts.  Even  if  the  translated  code  could  mimic  these 
manipulations,  the  result  is  meaningless  if  the  target  machine
for  the  translated  code  does  not  support  the  same
floating  point  format. 
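
The first example can be made concrete: the same extracted bit string denotes different numbers under the two representations. The 6-bit field width and the function names are assumptions chosen for the illustration.

```python
def ones_complement(bits, width):
    """Interpret a bit string as a 1's-complement signed integer."""
    if bits & (1 << (width - 1)):
        return -((~bits) & ((1 << width) - 1))
    return bits

def twos_complement(bits, width):
    """Interpret the same bit string as a 2's-complement signed integer."""
    if bits & (1 << (width - 1)):
        return bits - (1 << width)
    return bits

field = 0b111010  # a hypothetical 6-bit field extracted from a machine word
print(ones_complement(field, 6))  # -5
print(twos_complement(field, 6))  # -6
```

A translator that faithfully reproduces the extraction still cannot know which of these values the programmer intended.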

"Doomed  to  fail"  is  too  harsh  a  statement,  of  course.
It  might  be  possible  to  specify  some  relatively  well-behaved
bit-twiddles  that  could  be  translated.  In  that  case
the  issue  of  cost-effectiveness  arises:  is  it  worthwhile  to 
design  an  algorithm  to  detect  and  translate  these  cases, 
which  are  expected  to  be  few,  or  would  it  be  better  to 
allocate  resources  to  other,  more  promising  areas,  and 
leave  all  bit-twiddling  for  human  intervention? 

6.2  Overlays

Perhaps  the  greatest  single  benefit  that  could  be  obtained
from  a  CMS-2  to  Ada  translator  would  be  the
analysis  of  the  rat's  nest  of  overlays  that  are  so  typical  of
CMS-2  code  and  translation  into  clean  Ada.  Unfortunately,
that  benefit  is  not  to  be  had.

To  begin  with,  overlays  are  potentially  bit-twiddles,  as 
has  been  noted,  and  suffer  from  all  of  the  problems  of  bit- 
twiddling.  For  example,  the  creation  of  a  floating-point 
value  by  constructing  its  components,  mentioned  above, 
can  be  accomplished  using  overlays. 

There  are  other  uses  of  overlays  that  are  not,  strictly 
speaking,  bit-twiddling.  One  such  is  to  simulate  a  multilevel
record  structure  (records  within  records  within
records),  which  is  not  directly  achievable  in  CMS-2.  An
algorithm  to  detect  such  a  use  of  overlays  is  possible,  but 
then  a  second  translation  problem  arises.  If  overlays  are 
used  in  CMS-2  to  simulate  a  record  within  a  record,  the 
larger  record  (the  overlay  parent)  has  a  simple  CMS-2 
type,  usually  numeric.  Since  there  is  nothing  correspond¬ 
ing  to  this  in  Ada  (a  record  type  does  not  have  an  associ¬ 
ated  scalar  type),  it  would  be  necessary  to  check  all  uses 
of  the  larger  record  to  see  if  it  is  truly  being  used  as  a 
value  of  that  type  or  if  the  type  is  simply  a  hook  to  hang 
the  simulation  on.  Translation  would  be  possible  in  the 
latter  case,  but  again  the  cost-effectiveness  question  must
be  asked. 


Finally,  overlays  can  be  used  to  create  structures  some¬ 
what  like  an  Ada  record  with  variant  part.  The  key  char¬ 
acteristic  of  this  kind  of  structure  is  that  no  component  is 
ever  read  without  first  being  explicitly  written.  While  this 
is  perhaps  the  most  benign  use  of  overlays,  detecting  it  is 
very  difficult.  At  the  least  an  algorithm  built  on  top  of  a 
global  data  flow  analysis  is  required.  Even  if  such  an 
algorithm  is  implemented,  there  is  no  guarantee  that  a 
particular  structure  of  this  kind  can  be  translated  into  an 
Ada  record  with  variant  parts.  Consider  the  following 
"variant  record":  it  has  a  discriminant  and  three  ordinary
components;  the  discriminant  can  take  on  values  1,  2,  and
3;  component  1  is  valid  when  the  discriminant  has  value
1  or  2;  component  2  is  valid  when  the  discriminant  has
value  1  or  3;  and  component  3  is  valid  when  the  discriminant
has  value  2  or  3.  This  structure  cannot  be  translated
into  an  Ada  record  with  variant  part.  The  symbol  table 
of  the  CMS-2Y  compiler  is  this  kind  of  structure;  it  has 
been  carefully  constructed  to  allow  common  components 
to  be  shared  between  almost  random  values  of  the  dis¬ 
criminant. 

6.3  Direct  Code 

There  seems  to  be  a  hope  among  those  interested  in 
using  a  translator  to  assist  them  in  transitioning  from 
CMS-2  to  Ada  that  it  can  figure  out  what  the  direct  code 
is  doing  and  translate  it  into  Ada.  This  simply  cannot  be 
done.  One  example  should  suffice:  how  can  a  translator 
determine  the  meaning  of  an  instruction  that  loads  a  regis¬ 
ter  using  indirection,  where  the  indirect  word  used  is 
constructed  during  execution? 

The  only  approach  that  has  any  hope  of  success  is  to 
simulate the direct code as part of execution of the translated
Ada. Basically, this means outputting, as part of
the translation, a simulator for the original target machine.
Even this is not foolproof, because much direct
code  depends  on  the  way  the  program’s  data  are  allocated 
by  the  CMS-2  compiler.  A  more  fundamental  challenge 
to  this  approach  is:  what  is  the  point?  Execution  of  an 
ISA  simulator,  albeit  written  in  Ada,  is  not  the  purpose  of 
translating  CMS-2  to  Ada. 

7.  TRADA  Translation  Strategies 

Most  of  the  translations  used  by  TRADA  are  straight¬ 
forward:  procedures  are  translated  into  procedures,  func¬ 
tions  into  functions,  assignment  statements  into  assignment 
statements  (usually),  etc.  There  are  also  a  number  of 
special  translations  used  to  overcome  dissimilarities  of  the 
two  languages. 
Even some of the problematic areas have relatively
straightforward translations. String manipulations (section
4.8) are handled through slicing and concatenating with
blanks. A multi-receptacle assignment (section 4.6) is
translated into a sequence of assignments with the appropriate
conversions, the only complication (and a very
minor one at that) occurring when the receptacles are a
mixture of numeric and boolean.

Among the straightforward translations is an exact
duplication of the CMS-2 fixed-point scaling rules. Conversion
between fixed-point types is by truncation by
default, which means through the use of a TRADA-generated,
inefficient truncation function. The user can specify
that a simple Ada type conversion is to be employed
instead.  It  is  hoped  that  this  option  will  be  used  only  if 
the  user  verifies  that  the  algorithms  being  translated  are 
not  so  finely  tuned  that  a  one-bit  discrepancy  would  cause 
problems  or  if  it  is  known  that  the  Ada  compiler  being 
used  generates  code  that  does  (always)  convert  by  trunca¬ 
tion. 

7.1  Untranslatable  Constructs 

As  is  common  with  translators,  when  TRADA  encoun¬ 
ters  a  construct  it  cannot  translate,  it  outputs  the  construct 
as  a  comment,  along  with  a  special  comment  indicating 
that  user  intervention  is  required.  The  format  of  the 
comment  makes  it  easy  to  locate  using  an  editor. 

To  follow  this  approach  slavishly  would  reduce  the 
effectiveness  of  a  translation  of  CMS-2  because  of  the 
untranslatability  of  bit-twiddles.  Many  expressions,  and 
thus  the  statements  in  which  they  appear,  would  be 
marked  as  untranslatable. 

When  TRADA  encounters  a  bit-twiddle  while  translat¬ 
ing  an  expression,  it  is  translated  into  a  generated  object 
with the appropriate type. For example, a word reference
is translated into an object named Word_reference of a
type that is appropriate for the original target computer.
Similarly, a reference to the BIT pseudo-function is translated
as Bit_reference_len, where len is the length of the
specified  bit  string.  Through  this  technique,  the  other 
semantics  of  the  expression,  such  as  any  implicit  type 
conversions,  and  the  statement  in  which  the  expression 
appears,  are  translated.  The  comment  that  user  interven¬ 
tion  is  required  is  also  output,  of  course. 

The  declarations  of  these  generated  objects  are  gath¬ 
ered  into  a  single  package. 
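
As an illustration only, the generated declarations and one use of them might look like the sketch below. The 32-bit word size, the type names, and the wording of the intervention comment are assumptions made for this example; they are not taken from TRADA's actual output.

```ada
procedure Bit_Twiddle_Demo is

   -- Stand-in for the single package that gathers the generated objects.
   package Generated is
      -- Assumption: a 32-bit word on the original target computer.
      type Word   is range -2**31 .. 2**31 - 1;
      type Bits_5 is range 0 .. 2**5 - 1;

      Word_reference  : Word   := 0;
      Bit_reference_5 : Bits_5 := 0;   -- BIT reference of length 5
   end Generated;

   Speed : Integer := 0;
begin
   -- *** USER INTERVENTION REQUIRED (hypothetical comment format) ***
   -- The original CMS-2 statement would be echoed here as a comment.
   -- The conversion and the addition are translated; only the
   -- bit-twiddle itself is left as a placeholder object.
   Speed := Integer (Generated.Bit_reference_5) + 1;
end Bit_Twiddle_Demo;
```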


7.2  Array  Assignments 

The  translation  of  a  CMS-2  array  is,  in  general,  fairly 
straightforward.  A  type  is  generated  to  translate  the 
array’s  component  structure.  A  second  type  is  generated 
to  declare  the  array  type  itself.  (This  type  is  uncon¬ 
strained.)  This  second  type  is  then  used  to  declare  the 
translated  array  object. 

Following  this  simple  approach  would  result  in  trans¬ 
lating  many  otherwise  acceptable  CMS-2  array  assign¬ 
ments  into  invalid  Ada,  because  the  corresponding  Ada 
array  types  would  have  different  names.  In  order  to 
translate  array  assignments  whose  CMS-2  semantics 
match those of Ada, TRADA analyzes the structure of all
array  components  and  record  type  specifications  in  the 
program  being  translated  to  determine  common  structures. 
Using  the  information  from  this  analysis,  arrays  that  have 
common  component  structures  are  translated  using  a 
common  array  type  and  array  assignments  involving 
arrays  whose  components’  structures  are  different  are 
flagged  as  being  untranslatable. 

This  analysis  also  uncovers  assignments  of  arrays 
whose  components’  structures  are  not  identical.  These 
assignments  are  flagged  as  untranslatable. 

The "smallest controls" semantics of CMS-2 array
assignment  (section  4.7)  requires  some  array  assignments 
to  be  translated  using  slicing  or  loops. 
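
The effect of a common array type can be sketched as follows. The names are hypothetical, and the slice assumes that "smallest controls" means the shorter array determines how many components are copied. Had Current and Previous been declared with two separately named array types, the first assignment would be rejected by an Ada compiler even though the component structures are identical.

```ada
procedure Array_Assign_Demo is
   -- Type generated for the common component structure.
   type Track is record
      Id  : Integer;
      Rng : Float;
   end record;

   -- One unconstrained array type shared by every array whose
   -- components have this structure.
   type Track_Array is array (Integer range <>) of Track;

   Current  : Track_Array (1 .. 4) := (others => (Id => 0, Rng => 0.0));
   Previous : Track_Array (1 .. 4) := (others => (Id => 1, Rng => 2.0));
   Short    : Track_Array (1 .. 2);
begin
   Current := Previous;            -- legal: same type, same length
   Short   := Previous (1 .. 2);   -- slicing copies only what fits
end Array_Assign_Demo;
```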

7.3  Indirect  Arrays 

Because  the  indirect  array  of  CMS-2  has  so  many  of 
the  properties  of  a  pointer  in  other  languages,  it  is  trans¬ 
lated  as  an  object  of  an  access  type  whose  designated 
object is the corresponding array type. The only translation
difficulty arises when an indirect array is set to point
to a "real" array, one that is not indirect.

TRADA’s  solution  to  this  problem  is  to  note  all  "real" 
arrays  that  are  pointed  to  at  any  point  in  the  program 
being  translated.  They  are  then  also  translated  into  the 
designated  objects  of  an  appropriate  constant  access  ob¬ 
ject.  The  object  declaration  includes  the  allocator  to 
create  the  designated  object.  If  the  real  array  includes 
initial  values  for  some  components,  this  initial  value  is 
supplied  either  as  a  qualified  expression  in  the  creating 
allocator  or  by  code  in  the  initialization  block  of  the  ap¬ 
propriate  package  body.  (The  choice  is  user  selectable.) 

This translation strategy illustrates why it is necessary
to translate an entire program rather than a program
fragment. A program fragment might contain the declaration
of a "real" array but no instances of its being pointed
to. It would therefore be translated into a static object
and this translation could not be used with other parts of
the program where its access value is needed.

(When  translating  into  Ada  9X,  the  same  analyses  must 
be  done  but  the  actual  translation  is  somewhat  simpler 
because  of  the  aliased  attribute.) 
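
A minimal sketch of the Ada 83 style strategy follows, with hypothetical names. The "real" array becomes the designated object of a constant access object whose declaration contains the allocator, so an indirect array can later be set to point to it.

```ada
with Ada.Text_IO;

procedure Indirect_Array_Demo is
   type Table is array (Integer range <>) of Integer;
   type Table_Access is access Table;

   -- The "real" array: created by an allocator in the declaration of
   -- a constant access object, with its initial value supplied as a
   -- qualified expression.
   Real_Table : constant Table_Access := new Table'(1 .. 3 => 0);

   -- The indirect array is simply an access object.
   Indirect : Table_Access;
begin
   Indirect := Real_Table;    -- set the indirect array to the real one
   Indirect (2) := 7;         -- update through the indirect array

   -- Both names denote the same array, so the update is visible here.
   Ada.Text_IO.Put_Line (Integer'Image (Real_Table (2)));
end Indirect_Array_Demo;
```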

7.4  Common  Enumeration  Types 

Because  of  the  very  loose  semantics  of  the  CMS-2 
enumeration types, TRADA must detect enumeration type
specifications  that  specify  the  same  set  of  values  in  order 
to  effectively  translate  assignment  and  comparison  of 
enumeration  values.  (This  is  quite  similar  to,  but  much 
easier  than,  the  problem  of  detecting  common  array  com¬ 
ponent  structures.)  Common  enumeration  type  declara¬ 
tions  are  then  used  in  the  translated  code. 

If  assignments  or  comparisons  involve  CMS-2  enumer¬ 
ation  types  that  do  not  specify  the  same  set  of  values, 
TRADA  translates  by  forcing  a  conversion  of  the  Ada 
encoding,  using  code  of  the  form 

To_type'Val(From_type'Pos(Value)).

If  the  source  type  has  more  values  than  the  destination 
type,  execution  of  this  expression  could  raise  a 
Constraint_error  exception.  This  is  an  example  of  a 
translation  that  is  possibly  inaccurate,  but  the  inaccurate 
translation  is  detectable  by  other  means. 
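
The forced conversion and its failure mode can be sketched as below. The two enumeration types are hypothetical. The conversion is wrapped in a function so that it is evaluated at execution time rather than at compile time.

```ada
with Ada.Text_IO;

procedure Enum_Convert_Demo is
   type From_Type is (Red, Green, Blue, Violet);   -- four values
   type To_Type   is (Low, Medium, High);          -- three values

   -- Conversion by position number, in the form used by the translation.
   function Convert (Value : From_Type) return To_Type is
   begin
      return To_Type'Val (From_Type'Pos (Value));
   end Convert;
begin
   -- Position 1 exists in both types.
   Ada.Text_IO.Put_Line (To_Type'Image (Convert (Green)));

   -- Position 3 has no counterpart in To_Type, so this call raises
   -- Constraint_Error instead of producing a value.
   Ada.Text_IO.Put_Line (To_Type'Image (Convert (Violet)));
exception
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("Constraint_Error: no value at that position");
end Enum_Convert_Demo;
```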

7.5  Named  Numbers 

CMS-2  named  numbers  are  translated  to  Ada  named 
numbers,  in  general.  For  the  most  part,  expressions 
using  named  numbers  are  left  as  expressions;  they  are  not 
folded  by  TRADA  into  static  values.  This  allows  the 
named  number  to  continue  being  used  in  Ada  as  a  source 
code  parameter. 

The  primary  exception  to  this  strategy  arises  when  a 
named  number  is  used  to  specify  an  attribute  of  a  numeric 
type,  such  as  the  number  of  fractional  bits  of  a  fixed-point 
type.  To  leave  the  type  parameterizable  in  the  Ada  code 
would  require  an  elaborate  translation.  One  aspect  of  this 
translation  would  be  an  execution-time  interpretation  of 
the  CMS-2  scaling  rules.  For  this  reason,  the  attributes 
of  numeric  types  are  fixed  at  translation  time,  even  if  they 
are parameterized in the CMS-2 code. (Often this parameterizing
of numeric types comes from an ill-considered
adherence to the rule that no hard-coded constants should
appear in a program.)

7.6  Boolean  0  and  1 

All uses of CMS-2 named numbers are analyzed to see
if  they  are  ever  used  in  a  boolean  context.  If  so,  they  are 
translated  as  boolean  constants,  rather  than  named  num¬ 
bers.  The  cases 

TRUE  EQUALS  1  $ 

FALSE  EQUALS  0  $ 

are  handled  specially.  They  are  not  translated  at  all  and 
references  are  translated  into  True  and  False  from  pack¬ 
age  Standard. 

In  the  unlikely  but  not  impossible  case  that  such  a 
named  number  is  also  used  as  an  integer,  the  form 

Boolean'Val (Value) 

is employed. Other, more elegant, translations are possible,
but the circumstance is so unlikely that it seems
inappropriate to expend effort on it. (Code that uses the
same  identifier  in  both  numeric  and  boolean  contexts 
should  probably  be  rethought.) 

7.7  Subprogram  Parameter  Passing 

Initially  TRADA  intended  to  mimic  the  CMS-2  param¬ 
eter  passage  faithfully:  each  CMS-2  subprogram  would  be 
translated  into  a  parameterless  Ada  subprogram,  each 
invocation of a subprogram would be preceded by assignment
statements mimicking the assignment of the input
parameters and followed by assignment statements mimicking
the assignment of the output parameters. This approach
would solve the problem of cross-parameter interference.
However, it was recognized early on that it
would  clutter  the  Ada  code  quite  a  bit.  This  was  particu¬ 
larly  true  in  evaluating  an  expression  that  contained  more 
than  one  function  reference,  when  it  would  be  necessary 
to  evaluate  the  function  references  in  the  proper  order  into 
temporary locations (which would need largely meaningless
TRADA-generated names), each evaluation being
preceded by the parameter assignment statements. This
clutter  would  conflict  with  the  goal  of  producing  code  that 
is  at  least  as  maintainable  as  the  original  CMS-2. 

The  strategy  settled  on  is  to  have  formal  parameters  of 
mode  IN  corresponding  to  the  CMS-2  input  parameters 
and  formal  parameters  of  mode  OUT  corresponding  to  the 
CMS-2  output  parameters.  Subprogram  invocations  are 
then  translated  in  the  straightforward  manner.  Execution 
of the body of the subprogram begins with assigning the
formal input parameters to the external objects that are the
translations of the CMS-2 formal input parameters.
Thereafter, those external objects are referenced rather
than the Ada formal parameters. For procedures, the
values of the external objects that are the translations of
the CMS-2 formal output parameters are assigned to the
corresponding  Ada  formal  output  parameters  immediately 
before  returning. 

This  strategy  translates  the  semantics  of  the  CMS-2 
subprogram invocation except in the case of cross-parameter
interference. TRADA analyzes all invocations for
such interference and outputs a warning message when it
is detected, calling for post-translation analysis and modification.
Cross-parameter interference occurs rarely and
is  a  poor  programming  practice  when  it  is  used.  It  is 
better  to  handle  it  this  way  than  to  faithfully  simulate  it 
with  elaborate  code  sequences  that  are  difficult  to  main¬ 
tain. 
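
The strategy that was settled on can be sketched as follows. All names are hypothetical. Scale_In and Scale_Out stand for the external objects that translate the CMS-2 formal parameters.

```ada
with Ada.Text_IO;

procedure Param_Demo is
   -- External objects: translations of the CMS-2 formal parameters.
   Scale_In  : Integer := 0;
   Scale_Out : Integer := 0;

   Result : Integer;

   procedure Scale (Input : in Integer; Output : out Integer) is
   begin
      Scale_In  := Input;          -- formal input copied to external object
      Scale_Out := Scale_In * 2;   -- body references external objects only
      Output    := Scale_Out;      -- copied back immediately before returning
   end Scale;
begin
   Scale (21, Result);             -- invocation translated directly
   Ada.Text_IO.Put_Line (Integer'Image (Result));
end Param_Demo;
```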

7.8  Procedure  Abnormal  Exits 

Procedures  with  exit  parameters  are  translated  into 
procedures with an "extra" parameter of mode OUT.
This parameter is of an enumeration type which has one
value corresponding to each exit parameter and one value
corresponding to a "normal" return. In the body of the
translated procedure this parameter is set appropriately
depending on the return statement executed. The parameter's
value is then tested following each invocation to
determine the next statement to be executed.

In  order  to  avoid  changing  the  values  of  the  actual 
output parameters when an "abnormal" exit is taken, the
mode  of  the  formal  output  parameters  is  made  IN  OUT 
and  the  assignment  of  the  external  objects  to  the  corre¬ 
sponding  formal  output  parameters,  described  above,  is 
not  done  when  an  abnormal  exit  is  translated.  The  mode 
ensures  that  the  values  are  not  changed  for  parameters 
that  use  the  copy  in/copy  back  mechanism.  (If  the  mode 
were  left  as  OUT,  the  uninitialized  bit  pattern  that  fills  the 
stack  location  allocated  to  a  formal  output  parameter 
would  be  copied  to  its  corresponding  actual  parameter  at 
the  conclusion  of  execution.  When  the  mode  is  IN  OUT, 
that  stack  location  is  filled  with  the  value  of  the  actual 
parameter at the beginning of execution.)
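
A minimal sketch of this scheme follows, with hypothetical names and a single exit parameter. The output parameter has mode IN OUT and is not assigned on the abnormal path, so the caller's value survives.

```ada
with Ada.Text_IO;

procedure Exit_Demo is
   -- One value per exit parameter plus one for the normal return.
   type Lookup_Exit is (Normal, Not_Found);

   Lookup_Value : Integer := 0;   -- external object for the output parameter
   Answer       : Integer := 99;
   Status       : Lookup_Exit;

   procedure Lookup
     (Key        : in     Integer;
      Value      : in out Integer;       -- IN OUT rather than OUT
      Exit_Taken :    out Lookup_Exit)   -- the "extra" parameter
   is
   begin
      if Key < 0 then
         Exit_Taken := Not_Found;   -- abnormal exit: Value is not assigned
         return;
      end if;
      Lookup_Value := Key * 10;
      Value        := Lookup_Value;
      Exit_Taken   := Normal;
   end Lookup;
begin
   Lookup (-1, Answer, Status);

   -- The extra parameter is tested after the invocation.
   case Status is
      when Normal =>
         Ada.Text_IO.Put_Line ("normal return");
      when Not_Found =>
         Ada.Text_IO.Put_Line
           ("abnormal exit, Answer still" & Integer'Image (Answer));
   end case;
end Exit_Demo;
```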

7.9  Procedure  Switches

A  procedure  switch  is  translated  into  a  procedure, 
whose  name  is  the  switch  name.  The  formal  parameters 
of this procedure correspond to the switch's formal parameters,
according to the same scheme as for translating
procedure formal parameters. The body of the procedure
contains a case statement or an if-elsif-else sequence to
accomplish the switch. The form of the body is determined
by such considerations as switch value density, but
in  general  indexed  switches  give  rise  to  case  statements 
and  item  switches  give  rise  to  if-elsif-else  sequences. 

For  an  indexed  procedure  switch,  the  corresponding 
procedure  has  an  'extra'  input  parameter,  which  is  the 
value  to  be  used  in  making  the  switch. 

If  the  switch  is  ever  invoked  using  an  invalid  clause, 
the corresponding procedure has an 'extra' boolean output
parameter, which is set according to the validity of the
switch  value. 
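
An indexed procedure switch translated along these lines might look like the sketch below. The switch and procedure names are hypothetical, and the copying of parameters to external objects described in section 7.7 is omitted for brevity.

```ada
with Ada.Text_IO;

procedure Switch_Demo is
   Ok : Boolean;

   procedure Start_Up (Unit : in Integer) is
   begin
      Ada.Text_IO.Put_Line ("start" & Integer'Image (Unit));
   end Start_Up;

   procedure Shut_Down (Unit : in Integer) is
   begin
      Ada.Text_IO.Put_Line ("stop" & Integer'Image (Unit));
   end Shut_Down;

   -- The procedure takes the name of the switch.
   procedure Mode_Switch
     (Index : in  Integer;    -- "extra" input: the switch value
      Unit  : in  Integer;    -- the switch's own formal parameter
      Valid : out Boolean)    -- "extra" output: validity of Index
   is
   begin
      Valid := True;
      case Index is
         when 0      => Start_Up (Unit);
         when 1      => Shut_Down (Unit);
         when others => Valid := False;
      end case;
   end Mode_Switch;
begin
   Mode_Switch (1, 7, Ok);    -- selects Shut_Down
   Mode_Switch (5, 7, Ok);    -- invalid switch value
   Ada.Text_IO.Put_Line (Boolean'Image (Ok));
end Switch_Demo;
```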

7.10  Loops 

Because  of  the  free  use  that  can  be  made  of  a  loop 
index  in  CMS-2,  a  simple  indexing  loop  cannot  be  easily 
translated  into  a  FOR-loop  in  Ada,  where  the  use  of  the 
index  is  very  restricted.  Before  such  a  translation  could 
be  made,  a  heavy  analysis  of  all  uses  of  the  index  would 
have  to  be  done,  checking  for  such  things  as  updating  the 
index,  accessing  the  index  from  within  a  subprogram  that 
is  invoked  (directly  or  indirectly)  in  the  loop,  use  of  the 
loop  value  after  loop  termination,  etc.  Rather  than  invest 
in this analysis, TRADA translates such loops into "infinite"
loops, with explicit incrementation and termination
testing  at  the  bottom. 

A simple WHILE-loop in CMS-2 is translated into a
WHILE-loop in the obvious fashion. However, a loop-UNTIL
must be translated into an "infinite" loop with an
explicit test at the bottom. The "infinite" loop with explicit
termination tests must also be used when the CMS-2
loop  has  a  combination  of  termination  conditions,  such  as 
multiple  indexes,  an  index  and  an  UNTIL-condition,  etc. 

Even  the  "infinite"  loop  is  too  restrictive  when  the 
CMS-2  loop  is  resumed  from  outside.  In  this  case  the 
looping  structure  is  controlled  by  goto  statements,  to 
allow  the  translation  of  the  resume  statement  to  branch 
into  the  loop. 
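
The shape of a translated indexing loop can be sketched as follows, with hypothetical names. Because the index is an ordinary variable, the loop body may update it and its value remains available after the loop, neither of which an Ada FOR-loop permits.

```ada
with Ada.Text_IO;

procedure Loop_Demo is
   I   : Integer;
   Sum : Integer := 0;
begin
   I := 1;                 -- explicit initialization of the index
   loop
      Sum := Sum + I;
      I := I + 1;          -- explicit incrementation
      exit when I > 5;     -- termination test at the bottom
   end loop;

   -- The index is still accessible after loop termination.
   Ada.Text_IO.Put_Line (Integer'Image (Sum) & Integer'Image (I));
end Loop_Demo;
```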

7.11  Labels,  Goto  Statements,  and  Label  Switches 

Labels  and  goto  statements  are  translated  directly. 
Because  of  the  free  transfer  of  control  allowed  in  CMS-2, 
this  can  result  in  illegal  Ada  goto  statements.  The  error 
messages  produced  by  the  Ada  compiler  will  be  an  ade¬ 
quate  signal  that  human  intervention  is  required. 
In general, a label switch is translated as a case statement
or if-elsif-else statement with a goto statement in
each alternative. Whenever TRADA can verify that a
code  sequence  can  only  be  reached  by  a  transfer  through 
a  label  switch,  that  sequence  is  moved  into  the  appropriate 
alternative  of  the  case  statement. 

7.12  Data  Local  to  Subprograms 

In  general,  data  local  to  subprograms  are  translated 
into  regional  data  in  the  subprogram’s  package,  in  order 
to  preserve  their  staticness.  Such  a  datum’s  name  is 
modified  by  appending  the  subprogram’s  name  in  order  to 
avoid  name  clashes,  if  necessary. 

To  avoid  this  cluttering  of  names  and  to  make  effective 
use  of  the  Ada  stack,  TRADA  identifies  local  data  that 
are  always  assigned  before  they  are  referenced.  In  the 
translation,  these  data  remain  local  to  their  subprogram. 

7.13  Initial  Values 

Data  that  are  given  initial  values  in  CMS-2  are  given 
those  values  in  the  TRADA  output.  For  a  variable  of  the 
basic  types,  the  initial  value  is  specified  as  part  of  the 
corresponding  object  declaration.  This  technique  can  also 
be  used  for  an  array,  where  the  object  declaration  in¬ 
cludes  an  aggregate  specifying  the  initial  value.  For  a 
large  array,  the  aggregate  might  well  exceed  the  capacity 
of  the  Ada  compiler  being  used.  To  avoid  this  problem, 
a  TRADA  option  allows  the  user  to  specify  that  arrays 
should  be  given  their  initial  values  through  the  execution 
of  assignment  statements.  For  a  global  or  regional  array, 
these  executable  statements  are  placed  in  the  initialization 
block  of  the  array’s  package  and  are  thus  executed  only 
at  program  startup. 

The  user  can  specify  whether  data  that  are  not  given 
initial values should be left indeterminate or "zeroed", to
mimic the de facto zeroing of the CMS-2 linkers. If
"zeroing" is chosen, numeric data are given the value
zero, character data are given ASCII.NUL, and enumeration
data are given their type's first value.

When  aggregates  are  used  to  initialize  array  compo¬ 
nents,  the  above  "zeroing"  technique  is  used  for  compo¬ 
nents  that  have  no  specified  initial  value. 
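
The two ways of supplying initial values to an array can be sketched as below. The names and sizes are hypothetical. Small_Table uses an aggregate, with others => 0 playing the role of "zeroing"; Big_Table receives the same values from assignment statements in the initialization block of its package.

```ada
with Ada.Text_IO;

procedure Init_Demo is
   type Samples is array (1 .. 1000) of Integer;

   -- Initial value given by an aggregate in the object declaration.
   Small_Table : Samples := (1 => 5, 2 => 7, others => 0);

   package Big_Data is
      Big_Table : Samples;
   end Big_Data;

   package body Big_Data is
   begin
      -- Initialization block: executed once, when the package is
      -- elaborated.
      for I in Big_Table'Range loop
         Big_Table (I) := 0;
      end loop;
      Big_Table (1) := 5;
      Big_Table (2) := 7;
   end Big_Data;
begin
   -- Both techniques yield the same contents.
   Ada.Text_IO.Put_Line (Boolean'Image (Small_Table = Big_Data.Big_Table));
end Init_Demo;
```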

7.14  Typed  Records 

A  record  type  that  has  an  associated  basic  type  is  trans¬ 
lated  into  an  Ada  record  type.  When  translating  an  object 
of this type, a "shadow" object of the translated basic type
is also created, and a comment calling for user intervention
is also output. References to the whole object are
then  translated  as  references  to  the  "shadow"  object. 

7.15  Conditional  Compilation 

Two features of CMS-2 conditional compilation
make translation into "parameterized" Ada impossible:
the fact that the code in a block being ignored might
not be correct and the possibility of declaring an identifier
multiple times in distinct blocks. For this reason,
TRADA  translates  only  one  configuration  of  the  program, 
the  one  corresponding  to  the  settings  of  the  compilation 
flags  at  the  time  of  translation. 

8.  A  TRADA  Shortcoming 

The  unusual  nature  of  subprogram  formal  parameters 
gives  rise  to  one  tractable  problem  that  TRADA  does  not 
attempt  to  handle.  If  an  expression  contains  two  or  more 
function  references,  then  the  execution  of  one  of  those 
references  could  affect  the  parameter  passage  of  a  subse¬ 
quent  one.  This  can  happen  two  ways:  during  the  param¬ 
eter  passage  stage  of  the  earlier  function  reference  or 
during  execution  of  its  body.  At  either  of  these  times,  a 
value  could  be  changed  that  plays  a  role  in  the  actual 
parameters  of  the  later  reference,  either  by  changing  an 
actual  parameter  or  by  changing  a  value  that  is  used  to 
select  an  actual  parameter,  such  as  a  value  that  occurs  in 
a  subscript  expression. 

Although  it  is  not  possible  to  detect  this  inter-function 
parameter  interference  with  100%  certainty,  the  problem 
could  be  solved  by  breaking  the  expression  apart  to  make 
certain  that  the  translated  function  references  occur  in  the 
same  order  as  in  the  CMS-2  code.  As  was  discussed  in 
section  7.7,  this  can  result  in  obscure  code  that  is  difficult 
to  maintain.  Since  using  this  "feature"  is  an  unconsciona¬ 
ble  programming  practice  and  its  uses  are  extremely  rare, 
it  was  decided  to  aim  for  the  readability/maintainability 
goal  rather  than  bulletproof  translation  in  this  case,  even 
though  it  is  not  possible  to  give  the  usual  warning  that 
intervention  is  required. 

This  is  the  only  known  case  in  which  TRADA  will 
generate  possibly  incorrect  Ada  without  either  a  warning 
message,  an  Ada  compiler  error,  or  an  exception  during 
execution  of  the  translated  code. 

9.  Summary 

TRADA is a translator driven by the goals of conservative
translation (either faithfully reproducing the semantics
of a construct or marking it as untranslatable, requiring
user intervention), producing maintainable code, and
producing  transportable  code.  Of  the  three  goals,  conser¬ 
vative  translation  is  the  most  important.  With  it,  the 
user’s  part  of  the  translation  effort  can  be  attacked  without 
worrying  about  the  part  done  by  TRADA. 

TRADA  achieves  its  goals  to  a  high  degree.  It  often 
uses  full  knowledge  of  the  entire  program  being  translated 
in order to choose its translation strategy, where relying on
only  local  knowledge  might  lead  to  an  incorrect  transla¬ 
tion.  Choices  between  alternative  translation  strategies 
have  been  made  on  the  basis  of  maintainability  and  knowl¬ 
edge  of  the  de  facto  uses  of  the  construct  being  translated. 
Lastly,  it  refuses  to  translate  into  non-portable  Ada  or  to 
translate  CMS-2  constructs  whose  semantics  are  not  fully 
known. 


Ada  is  defined  in 

ANSI/MIL-STD-1815A-1983,  Ada  Programming 
Language,  Department  of  Defense,  Washington,  D. 
C.,  January  1983. 
Abstract

The  paper  describes  a  methodology  for  translating 
concurrent  software  into  Ada,  where  the  concurrency  is  ex¬ 
pressed  in  Operating  Systems  calls  embedded  in  the  source 
software.  There  are  two  important  advantages  to  such  a 
translation.  First,  the  understanding  and  analysis  of  concur¬ 
rency  expressed  through  Operating  System  calls  is  very 
complex  and  difficult.  Translating  these  calls  (as  well  as  the 
sequential  portions)  into  Ada  greatly  simplifies  the  analy¬ 
sis,  and  understanding  of  the  software  by  providing  the  op¬ 
erational  semantics  of  the  calls.  Second,  the  translation  to 
Ada  eliminates  dependence  on  Operating  Systems. 

The  focus  of  the  paper  is  on  translation  of  concurrency 
related  software  portions  of  the  source  software  into  Ada.  It 
is  part  of  a  larger  system  for  translating  both  sequential  and 
concurrent  aspects  of  real-time  applications  software. 
Translating  the  sequential  portions  of  the  source  code  into 
Ada  has  been  documented  separately  in  previous  reports  [4, 
13].  These  reports  provide  background  to  the  presently  re¬ 
ported  work. 

The immediate motivation for the work described
here is the need to automatically translate Navy software
in CMS-2 into Ada, where the concurrency is expressed
by calls to the SDEX-20 or ATES Operating Systems.
Many Navy mission-critical systems use these
languages. The translation to Ada is needed to modernize
these systems.

The  methodology  we  propose  for  replacing  Operating 
System  calls  by  equivalent  Ada  code  uses  a  standard  mes¬ 
sage-based  kernel  oriented  architecture.  This  architecture 
is  specialized  for  each  Operating  System  and  modularly  ex¬ 
tended  to  represent  the  individual  features  (concurrency 
control,  input/output,  etc.)  supported  by  that  Operating  Sys¬ 
tem.  Within  this  architecture  the  determination  of  a  generic 
method  for  translating  into  Ada  a  new  Operating  System 
call,  though  non-elementary,  is  not  overly  time  consuming 
(we  estimate  less  than  one  week  of  programming  for  each 
additional  Operating  System  call).  The  use  in  the  transla¬ 
tion  of  a  standard  message-based  architecture  has  a  cost  in 


lost  performance  with  respect  to  the  performance  attainable 
in  an  optimal  ad  hoc  translation  using  a  shared  memory 
model.  Upper  bounds  on  this  performance  cost  can  be  de¬ 
termined,  and  can  be  used  to  establish  the  characteristics  of 
the  hardware  required  to  achieve  specific  real-time  dead¬ 
lines. 

Translation is based on the notion of functional
equivalence of the source and target software. It is
attained  by  adhering  closely  in  the  target  Ada  software  to 
the  data,  data  layouts,  software  units  and  data  transforma¬ 
tions  in  the  source  software.  In  some  cases,  additional  in¬ 
formation,  typically  found  in  a  software  specification,  is 
needed  to  achieve  programs  that  are  free  of  hardware  and 
scheduling  dependencies. 

The  translation  uses  an  intermediate  entity-relation- 
attribute  graphic  model  of  the  target  Ada  software  which  is 
created  and  progressively  enriched.  It  serves  as  a  repository 
for  all  the  information  extracted  from  the  source  software 
and  the  software  specification.  The  graphic  software  model 
is  then  used  as  a  basis  for  generating  the  target  Ada  soft¬ 
ware,  for  analysis  and  abstracting  of  the  software  and  for 
generating  an  up-to-date  software  specification. 

The  concurrency  translation  to  Ada  is  described  in  this 
paper  in  sufficient  detail  for  executing  it  manually,  or  im¬ 
plementing  it  automatically.  The  methodology  utilizes 
templates  for  data  structures,  tasks  and  procedures  for  trans¬ 
lating  into  Ada  of  multiple  Operating  Systems.  Additional¬ 
ly,  for  each  Operating  System  call,  it  is  necessary  to  com¬ 
pose  templates  of  procedures  that  execute  the  respective 
protocols. The templates are placed in the Ada library,
provided in [6]. It includes the generic templates and, as an
example, templates for seven Unix calls. These templates
provide a general framework for the translation of
concurrency related Operating System calls to Ada.
A concurrent software example in C using Unix Operating
System calls is provided in the Appendix. It illustrates
synthesis  of  the  Ada  templates  and  generation  of  a  complete 
Ada  program.  The  choice  of  C  and  Unix  is  due  to  the  famil¬ 
iarity  of  readers  with  these  languages. 

Needed future work is also described. It consists of:

(i) Extending the present approach to distributed processing,
using Ada-9X.

(ii) Extending the target software to assure mutual exclusion,
progress, and limited postponement, independently of the
hardware speeds and scheduling that are used.

(iii) Selecting new target hardware, to ensure that timing
requirements are met.

1.  Introduction 

This  paper  describes  the  translation  of  concurrency- 
related  Operating  System  calls  to  Ada.  Concurrency-re¬ 
lated  Operating  Systems  calls  generally  have  been  infor¬ 
mally  documented.  This  has  made  understanding  and 
analysis  of  such  concurrent  programs  extremely  difficult. 
The translation to Ada greatly simplifies complete presentation
and understanding of the role of concurrency. Further,
the software becomes independent of Operating Systems
used with the hardware.

The immediate motivation for the work described in
this paper has been the need for automatic translation
of concurrent real-time Navy software in CMS-2 into Ada.
Embedded in the CMS-2 code are concurrency-related
calls to ATES or SDEX-20 Operating Systems. Such programs
are widely used in Navy mission-critical applications.
The translation is needed for modernization of these
systems.

Still another motivation has been the relatively
new field of Software Reengineering. Its objective is to process
existing software automatically, or semi-manually, in
order to obtain modern software for the same application or
new applications for execution on a high speed distributed
network.

The overall reengineering process involves code
translation and creation of a software model that is used to
provide a number of capabilities. These include software
analysis, facilitation of software understanding, documentation
of the software, reorganization and restructuring of
the software and interfacing it with other software.

The translation of concurrency-related Operating
System calls is an integral part of providing these capabilities.
The translation of concurrency-related Operating System
calls to Ada is a step in a larger process. The software
translation process consists of several progressive steps,
illustrated in Figure 1 for the translation of CMS-2 to Ada.
They are as follows [14]:

(i) translation of sequential portions of the software into
Ada [4, 13];

(ii) translation of the concurrency-related statements of
the software, expressed in Operating System calls,
into Ada (this is the topic of this paper);

(iii) storage of a graphic model of the Ada target software
in a database and its display and modification [7, 13];

(iv)  generadon  of  a  software  specification  [S]; 

(v) enhancing concurrent software so that its logical
correctness is independent of processor speed (requires
further work; see Section 6);

(vi) simulating scheduling of concurrent software to
determine the new hardware resources needed to meet
timing requirements [10]. The hardware resources
may be in a distributed processors and communications
network (requires further work; see Section 6).

The paper focuses only on item (ii): the translation of
concurrency expressed in Operating System calls into Ada.
The paper describes this step in sufficient detail to have it
performed manually, or to program it. Steps (i), (iii), and
(iv) have been implemented previously and are reported in the
respective references. Steps (v) and (vi) require further
research.

This paper contains the following sections.

Section 2 is a technical problem statement. It establishes
the requirements for the translation by defining the
functional equivalence of the source and target software.

Section 3 describes the graphic model of the software. It is
created as the output of translating the source software,
adhering closely to its data, data layouts, software units
(tasks, procedures, functions), data transformations, and
their precedences and interactions. The use of a graphic
software model is important here because it is needed for
the progressive translation steps and because it is used later
for analysis, understanding and creating an up-to-date
specification of the software.

Section 4 presents the translation process. It consists of
generating Ada tasks and procedures to implement
concurrency-oriented Operating System calls. The translation
replaces concurrency-related Operating System calls with
respective Ada tasks, procedures and messages. It creates
tasks for the source program processes and for synchronizing
these processes.

Section 5 describes the translation process in greater
detail. It uses an Ada template library for synthesizing the
concurrent aspects of the code. The Ada library contains the
code needed generally for translating concurrency-related
Operating System calls and also example procedures for
translating several Unix Operating System calls. The Ada
library is given in [6]. The methodology is applicable to
other source software languages and Operating Systems (e.g.
CMS-2 with ATES and SDEX-20).

The translation of Operating System calls is only partly
feasible in some instances. To illustrate this point, the
paper includes in the Appendix an example of the translation
of the Unix fork call into Ada. The computational model of
Unix differs greatly from the computational model of Ada.
For example, Unix tasks are much "larger" granular objects
than Ada tasks: a Unix task has its own address space and
maintains by default information about open files, related
tasks, and signals; no such information is available by
default with Ada tasks, and in most implementations Ada tasks
share the same address space. More fundamentally, an
Operating System controls task switching and can make
decisions when dispatching a task from the Ready state to the
Running state (examples of such decisions are to suspend or
to terminate a task). No such fine-grained control is possible
in Ada since no facilities are provided by the language to
control task switching (Ada 9X will correct this problem to
an extent); even the drastic "Abort" statement does not take
effect until the aborted task, on its own, reaches a
synchronization point. One objective of the work reported here
is to clarify and bridge such differences.

The Appendix refers to templates for Unix concurrency-related
calls. Seven of these calls are shown as having been defined
by respective procedures. The implementation of procedures
for all of the Unix calls, and their placement in the Ada
library, is required for attaining the full translation of
Unix into Ada. The Appendix shows the use of the Ada
templates in the library for translating example source
software in C with Unix calls into Ada.

Section 6 concludes the paper with a description of needed
future research and development, as follows. The objective
of the translation is to be able to execute the produced Ada
software on new distributed hardware. The translation
presented here is restricted to use of a single processor and
Ada. It needs to be extended to use of distributed processing
and Ada 9X. It is also necessary to modify the target
software to be logically independent of timing dependencies
due to the hardware. Finally, it is necessary to determine
the processing and communication capacities needed to satisfy
the timing requirements.

2.  Technical  Problem  Statement 

The basic translation requirement is to produce Ada target
software which is functionally equivalent to the source
software. Functional equivalence is defined as follows:

A software specification defines legal input sequences and
respective outputs of the software, as well as additional
requirements. Functional equivalence of the source and target
software of the translation means that they both conform to
the software specification. They both consume legal input
sequences and produce respective legal outputs. A software
specification may not be reliable or even available. Instead,
it is preferable to use the source software whenever possible.
The source software that is used as input to the translation
is assumed to be a well-tested, extensively used and highly
reliable representation of the software specification, though
it may be incomplete in some of the cases described below. If
the source software is modified prior to the translation,
then it is required that the modification be tested or shown
not to affect the respective computations.

(i) Sequential Software: In this case the source software
is a complete representation of the specification (assuming
no timing requirements). The source software defines all the
precedences and operations needed to process legal input
sequences and produce respective legal outputs. The target
Ada software is then functionally equivalent to the source
software as it adheres to the latter's precedences and
operations.

(ii)  Concurrent  Software: 

(a) Source Software where the logical correctness is
independent of the hardware speed and Operating System:
In this case the source software also defines all the
concurrent execution threads and the synchronizations needed
for accepting legal inputs and producing legal outputs, as
defined in the software specification. For given inputs,
execution of the target Ada software may produce different
outputs from those produced by executing the source software,
but the target software is still functionally equivalent to
the source software as the outputs still comply with the
software specification. For example, the source and target
software may process the same sequence of inputs and produce
different sequences of outputs, yet both conform to the
software specification.
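
As an illustration of this definition, the following minimal sketch treats a software specification as a predicate over an input sequence and an output sequence. Python is used only as a neutral notation; it is not part of the translation tooling. The predicate `conforms` and the two stand-in programs `source_run` and `target_run` are hypothetical names introduced for this example, and the "doubling" specification is invented.

```python
from collections import Counter


def conforms(inputs, outputs):
    """Hypothetical specification: every input is doubled exactly once;
    the order in which the results appear is left unspecified."""
    return Counter(outputs) == Counter(2 * x for x in inputs)


def source_run(inputs):
    # Stand-in for the source software: results appear in arrival order.
    return [2 * x for x in inputs]


def target_run(inputs):
    # Stand-in for the target software: two concurrent threads are
    # interleaved differently, so the same results appear in another order.
    first_thread = [2 * x for x in inputs[0::2]]
    second_thread = [2 * x for x in inputs[1::2]]
    return first_thread + second_thread


def functionally_equivalent(inputs):
    # Equivalence is judged against the specification, not by comparing
    # the two output sequences with each other.
    return conforms(inputs, source_run(inputs)) and conforms(inputs, target_run(inputs))
```

For the inputs `[1, 2, 3, 4]` the two stand-ins return differently ordered outputs, yet `functionally_equivalent` holds because both satisfy the assumed specification.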

(b) Source Software where the logical correctness is
dependent on the source hardware speed and Operating
System: The source software programmer may have relied on
delays, due to relative speeds of the hardware in executing
portions of the software, and may have omitted entering the
respective synchronizations explicitly in the code. In this
case the source software is lacking some synchronization
statements needed to retain the logical correctness
independent of the hardware and Operating System used. These
synchronization statements must be added to either the source
or the target Ada software in order to comply with the
software specification. (This is further discussed in
Section 6.2, as future research.)

(iii) Timing: The timing requirements are documented in the
software specification, but typically not in the source
software. The needed capacity of the network's processors
and communications must be determined for executing the Ada
target software while guaranteeing the timing requirements.
(This is further discussed in Section 6.3, as future
research.)

Note that satisfying item (i) above is performed by
translating the sequential software in the top box in Figure
1. It has been reported previously [4, 13]. This is adequate
for purely sequential source software. However, by itself it
provides an incoherent concept of the software if the
software contains concurrencies. Satisfying item (ii)(a)
above is performed in the next box in Figure 1. This is the
main topic of this paper. Satisfying items (ii)(b) and (iii)
above is performed in the third and last boxes in Figure 1.
They depend on the existence of information not directly
available by inspecting the source code. In particular,
(ii)(b) requires the availability of specifications that
clarify what the concurrency constraints are, so that we can
determine which are enforced by the source code and which are
voided by assumptions made about hardware speed and
scheduling discipline. These items are discussed in Section 6
as needing future research.

To achieve functional equivalence, the translation is based
on very close adherence to the memory and transformations of
the source software. Instead of relying on the software
specification, we rely in items (i) and (ii)(a) above on the
software code, which has been assumed to be a tested,
tried-out, reliable, and machine-readable representation of
the software specification. The information from the source
software is stored progressively in an Ada-oriented graphic
software model.

Close adherence to the source software means that all the
entities, operations, and precedences of the source software
are represented in the graphic software model using Ada
semantics. They are as follows.

(i) Declarations of same named variables as declared in
source software, using the same memory layout, and
same scope (same shared memory).

(ii) An Ada task and communications to perform the
functions of the Operating System calls used in
source software.

(iii) Ada tasks to represent each of the concurrent
processes in the source software.

(iv) Ada procedures and functions to represent respective
procedures and functions in the source software.

(v) Ada I/O to represent I/O devices in the source
software.

(vi) Ada transformation statements (executable statements)
for each data or control transformation statement in the
source software. They perform the same operations in Ada
in the same sequential order.

The Ada-oriented graphic model also contains directed edges
between the above nodes/entities to indicate precedences in
execution of statements as well as other relations enumerated
below.

Figure 1: The Six Steps in Reengineering CMS-2 Software with Calls to ATES or SDEX-20 into Ada

3. A Graphic Model of the Target Ada Software

The main translation activity consists of progressively
creating from the source software a graphic software model
which employs Ada semantics. This model contains first the
translation into Ada of the sequential portions of the
source software. Then it is augmented by adding the
translation of concurrency-related Operating System calls.
Later the graph is further modified and extended
progressively, without affecting the declarations or order of
the operations of the source software. For example, changes
are made to partition overall large-scale software into
hierarchical software units. The graphic model is also
reorganized and restructured to conform to the
object-oriented Ada programming paradigm. The graphic
software model includes tracings to the source statements
whose translation caused the respective target Ada
statements. Although the target software may be reordered,
the respective source statements can be located and the
translation remains explainable and justifiable.

A summary of the contents of the graphic software model is
given in the following:

The graphic model representation is called an Elementary
Statement Language for Ada (ESL-Ada). ESL-Ada is a graphic
language to describe Ada code [4]. Every Ada statement forms
a node in the graph. Thus, there are nodes for declarations
of data, software unit types (tasks and procedures), generics
and their instantiations. There are also nodes for execution
statements (e.g. assignments, conditions, loops, etc.). A
node includes attributes of the respective statement such as
the parsed statement, a trace to a source statement, and an
icon for the node's display. Thus, all the statements in the
source software are represented progressively in the graphic
software model by nodes.

An ESL-Ada graph contains edges which connect the nodes to
form a tree. A tree branch from a node to another node at the
same software hierarchical level means that the statement of
the first node immediately precedes the statement of the
latter node. A branch from a higher software hierarchical
level block statement node to a chain of the next lower
level nodes represents lexical containment. This type of edge
is called a scope tuple. These edges specify all the
precedences derivable from the source program.

Additional edges are then created between pairs of nodes to
represent binary relations between nodes. These edges are
also called tuples. There are six additional types of tuples
indicating respective relationships. These tuples are used in
reordering and reorganizing code and for creating software
abstractions for documenting the software. They will also be
used for analysis of mutual exclusion and simulation of the
timing (see Section 6). They are as follows:

(i) a relation between a statement that references or
updates a variable and a statement that declares the
variable (memory tuple);

(ii) a relation between a procedure call statement and the
procedure's declaration (call tuple);

(iii) a relation between a message call statement and the
called entry point declaration (message tuple); this
includes synchronization calls;

(iv) a relation between an i/o call and the respective device
declaration (i/o tuple);

(v) a relation between a generic or type declaration and the
respective instantiation statement (type tuple);

(vi) a relation between a specification and its respective
body declaration, or between a with/use statement and
the respective package declaration (context tuple).

Each tuple/edge has attributes, such as its type, icon, etc.
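
The node and tuple structure described above can be pictured with a minimal sketch. It is written in Python purely as a modeling notation and is not the ESL-Ada implementation of [4]. The class and field names (`Node`, `EslGraph`, `TupleType`, `targets`) are assumptions, as is the name `PRECEDENCE` for the same-level ordering edge, which the text does not name.

```python
from dataclasses import dataclass, field
from enum import Enum


class TupleType(Enum):
    PRECEDENCE = "precedence"  # assumed name for same-level ordering edges
    SCOPE = "scope"            # lexical containment
    MEMORY = "memory"
    CALL = "call"
    MESSAGE = "message"
    IO = "i/o"
    TYPE = "type"
    CONTEXT = "context"


@dataclass(frozen=True)
class Node:
    ident: str
    statement: str     # the parsed Ada statement
    source_trace: str  # the source statement it was translated from


@dataclass
class EslGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (from_id, to_id, TupleType)

    def add_node(self, node):
        self.nodes[node.ident] = node

    def add_edge(self, src, dst, kind):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both end points must already be nodes")
        self.edges.append((src, dst, kind))

    def targets(self, src, kind):
        # All nodes reached from `src` by an edge of the given tuple type.
        return [d for s, d, k in self.edges if s == src and k is kind]


def build_example():
    # Illustrative content only: a block containing a declaration of X
    # followed by an update of X.
    g = EslGraph()
    g.add_node(Node("blk", "declare ... begin ... end;", "source block"))
    g.add_node(Node("decl", "X : Integer := 0;", "source declaration of X"))
    g.add_node(Node("asgn", "X := X + 1;", "source update of X"))
    g.add_edge("blk", "decl", TupleType.SCOPE)        # containment
    g.add_edge("decl", "asgn", TupleType.PRECEDENCE)  # decl precedes asgn
    g.add_edge("asgn", "decl", TupleType.MEMORY)      # update -> declaration
    return g
```

In this sketch the memory tuple runs from the statement that updates `X` to the statement that declares it, matching item (i) of the list above.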

A specification of the sequential translation process, for a
specific source language, requires that the syntax and
semantics of each construct in the source language are given
a syntactic and semantic representation in Ada. This has
been done so far for CMS-2 [4].

The translation of Operating System calls consists of their
representation in Ada in the graphic software model. As will
be discussed, the translation requires using Ada tasks and
procedures for Operating System calls.

The graph is stored as an Entity-Relation-Attribute database
[2], where the nodes are the entities and the relations
between them are the edges. The graphic software model is
stored in a graphic database. Retrieval queries and display
allow selective viewing of graphic views of the software from
different perspectives or at different granularities of the
code. These displays are critical to understanding the
software. The retrieved views can be modified graphically by
adding or deleting nodes or edges.

As noted, the ESL-Ada graph is the basis for software
abstractions and documentation. The overall graph is
partitioned into subgraphs of hierarchical software units.
These subgraphs progressively portray the partitioned units
of the software architecture. Separate graphs are produced
for each hierarchical software unit. Each graph for a
hierarchical software unit contains edges/tuples to nodes in
other hierarchical software units with which the unit
interacts. Diagrams and text are then generated as listed
below. They provide the abstractions of the software that
constitute the software's specifications in accordance with
DOD-STD 2167A [8].

Hierarchical Decomposition Diagrams: showing decomposition
of the overall software into hierarchical units (based on
the scope tuples).

Flow Diagrams: showing flow of data and control within and
between hierarchical units (based on the memory, call,
message and i/o tuples).

Interface Tables: showing the structure of inputs and
outputs of each hierarchical unit (based on the scope, call,
message and i/o tuples).

Object/Use Diagrams: showing for hierarchical units where
types or generics are defined and where they are used (based
on the type tuples).

Context Diagrams: showing the library units and where they
are used (based on the context tuple).

Comments Text: showing the comments in each hierarchical
unit. They are assumed to contain information on the
hierarchical unit's capabilities.

These  diagrams  can  be  retrieved  and  displayed  (with 
appropriate  layout)  based  on  queries  that  cite  the  desired 
nodes  and  tuple  types  as  parameters. 
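
A query of this kind can be sketched as a filter over the stored edges. The mapping from diagram to tuple types below is taken from the list above; everything else (the Python notation, the function name `select_view`, the edge representation as `(from, to, tuple_type)` triples, and the sample unit names) is an assumption made for illustration and not the query language of the graphic database.

```python
# Tuple types on which each generated diagram is based, as listed above.
DIAGRAM_TUPLES = {
    "hierarchical_decomposition": {"scope"},
    "flow": {"memory", "call", "message", "i/o"},
    "interface_table": {"scope", "call", "message", "i/o"},
    "object_use": {"type"},
    "context": {"context"},
}


def select_view(edges, nodes, diagram):
    """Return the edges of the requested diagram that touch any cited node.

    edges   -- iterable of (from_id, to_id, tuple_type) triples
    nodes   -- set of node identifiers cited in the query
    diagram -- a key of DIAGRAM_TUPLES
    """
    wanted = DIAGRAM_TUPLES[diagram]
    return [
        (src, dst, kind)
        for src, dst, kind in edges
        if kind in wanted and (src in nodes or dst in nodes)
    ]


# Hypothetical edges of a small model with two software units.
SAMPLE_EDGES = [
    ("unit_a", "stmt_1", "scope"),
    ("stmt_1", "decl_x", "memory"),
    ("stmt_1", "proc_p", "call"),
    ("unit_b", "pkg_q", "context"),
]
```

Calling `select_view(SAMPLE_EDGES, {"stmt_1"}, "flow")` keeps only the memory and call edges of `stmt_1`, which is the information a flow diagram for that statement would draw.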

4. Strategy for Translating Concurrency-Related Operating System Calls

This section defines the Ada entities that are synthesized to
accomplish the protocols of Operating System calls. The
translation replaces processes in the source software with
Ada tasks called functional tasks. Additional tasks are
created for a controller to execute Operating System calls
and for buffering of communications between other tasks.
Procedures are added for each Operating System call. They are
illustrated in Figure 2 and discussed below. Large rectangles
in Figure 2 denote Ada tasks; bubbles denote task entries;
communication paths denote flow of messages (showing
direction of the call and direction of the message); columns
of small rectangles (inside tasks) denote procedures for
executing Operating System calls.

The interprocess communication of the source software
Operating System is performed by an Ada task called the
controller. It is shown at the top of Figure 2. It only
performs the concurrency-related Operating System calls that
are actually used in the source software. (Operating System
error processing and I/O are not considered in this paper.)
The controller plays the role traditionally played by the
kernel in Operating Systems. That is, it provides the basic
mechanisms for supporting task creation, interaction, and
termination, and for communicating with the program's
environment.

The source Operating System also performs dispatching of
processes and uses a variety of underlying systems. However,
these services are provided by the Ada compiler when
generating object code, by inserting into the code Operating
System calls tailored for each vendor's Operating System,
hardware and communications. These capabilities therefore
are not included in the translation. The scheduling by the
source Operating System may differ from that of the target
Ada program but should not affect the correctness of the
software, although it may affect the timing (further
discussed in Section 6.3). If the source Operating System
supports priority assignments for processes, then
corresponding priorities are also generated for respective
Ada tasks. If the source language or Operating System
supports priorities for messages, then they are included in
the operations of the mailbox tasks, as described below.

The  messages  exchanged  between  tasks  are  of  two 
kinds: 

(i)  control  messages  for  interpreting  Operating  System 
calls;  these  messages  are  exchanged  by  the  controller 
task  and  the  functional  tasks. 

(ii)  data  messages  for  communicating  variables  among 
processes  as  specified  in  the  source  program;  these 
messages  are  sent  or  received  by  tasks  that  represent 
source  software  processes  (the  functional  tasks). 

The Controller task contains in its body a main loop that:

(i) receives a control message from other tasks (via the
controller's mailbox task) to perform the equivalent
of an Operating System call;

(ii) calls a procedure that executes the protocol of the
Operating System call. The protocol may involve
sending control messages to tasks and receiving
acknowledgements of protocol steps.

Figure 2: Ada's tasks, procedures, and messages to implement OS Calls


Thus, the Controller task is implemented as a monolithic
kernel in the sense that it can execute the protocol of only
a single Operating System call at a time. This should not be
a problem since most of the work of the call is done outside
the Controller, which only routes messages and updates task
control information. If it proves necessary, we will
reconsider the design of the Controller to become an
interacting family of tasks, each with a different priority,
corresponding to the priority of the callers and to specific
segments in the execution of the call. The declaration of a
controller task is inserted at the beginning of the target
Ada concurrent software. The controller is dynamically
provided with data on each task being created. This data is
similar to a Process Control Block of an Operating System.
It is called a Task Control Block (TCB).
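
The behavior of the controller's main loop can be sketched as a dispatch over a table of procedures. This is a simplified, single-threaded model in Python, not the generated Ada task: the class `Controller`, the call names `"create"` and `"suspend"`, and the dictionary used as a Task Control Block are all hypothetical. In particular, the sketch only updates task control information and does not send the command on to the destination task, and its loop ends when the mailbox is empty, whereas the real controller would wait for the next message.

```python
from collections import deque


class Controller:
    def __init__(self, procedures):
        self.mailbox = deque()        # stands in for the controller's mailbox task
        self.procedures = procedures  # only the OS calls used in the source software
        self.tcbs = {}                # task name -> Task Control Block (a dict here)

    def send(self, call, **args):
        # A functional task posts a control message requesting an OS call.
        self.mailbox.append((call, args))

    def run(self):
        # Main loop: one control message, hence one OS-call protocol, at a time.
        while self.mailbox:
            call, args = self.mailbox.popleft()
            self.procedures[call](self, **args)


def create_task(controller, name, priority):
    # Protocol of a hypothetical "create" call: record the new task's TCB.
    controller.tcbs[name] = {"priority": priority, "state": "ready"}


def suspend_task(controller, name):
    # Protocol of a hypothetical "suspend" call: update task control information.
    controller.tcbs[name]["state"] = "suspended"
```

Because each message is taken from the mailbox and fully processed before the next one is read, the sketch exhibits the monolithic property described above: no two call protocols ever overlap.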

Processes created by Operating System calls in the source
program are translated into declarations of respective Ada
tasks, the functional tasks. They are illustrated in the
middle of Figure 2. The declarations of functional tasks are
inserted in the places where there are calls to the Operating
System in the source software to create the respective
processes. Each functional task contains a main procedure
that corresponds to the sequential execution code of the
respective process in the source program. In Ada, the
sequential execution code is contained in the body of the
functional task. The functional task body may contain calls
to procedures that send control messages to the controller
task, thus causing execution of the protocol of the
respective Operating System call. A functional task may be
assigned a priority, as indicated by a respective source
software Operating System call. The inserted declarations of
functional tasks create the tasks dynamically, and their
creation is reported to the controller task.

An originating task may call the controller task to execute
an Operating System call on a destination task. The
controller task then sends the command to the destination
task. Each functional task must check periodically whether it
has a waiting command from the controller. If one exists,
then the functional task must execute the command. An example
of such a command is a call in one functional task for
suspending or terminating another functional task. When the
destination functional task receives the command, it suspends
or terminates itself normally. The checks for existence of a
waiting message add overhead to the execution of these
commands. The required response time to these commands
determines the frequency of checking for such messages. This
overhead must be compensated for by using much faster
hardware with the target Ada software than the hardware used
with the source software.
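
The trade-off between response time and checking overhead can be made concrete with a small simulation. It is a sketch under stated assumptions, written in Python rather than Ada: the function `run_until_terminated`, the notion of a unit "step" of work, and the parameters `check_every` and `arrival_step` are all invented for illustration.

```python
def run_until_terminated(total_steps, check_every, arrival_step):
    """Simulate a functional task that polls for a waiting command.

    The task performs `total_steps` units of sequential work and checks
    its mailbox before every `check_every`-th unit. A terminate command
    becomes available at `arrival_step`.

    Returns (step at which the task stops, number of checks made).
    """
    polls = 0
    for step in range(total_steps):
        if step % check_every == 0:
            polls += 1
            if step >= arrival_step:
                # The command is waiting: the task terminates itself normally.
                return step, polls
        # One unit of the task's own sequential work would run here.
    return total_steps, polls
```

With a command arriving at step 23, checking every 10 steps stops the task at step 30 after 4 checks, while checking every 2 steps stops it at step 24 but costs 13 checks. A shorter required response time therefore means more frequent checks and more overhead, as the paragraph above states.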


For the reasons below, it was determined necessary to have
all communications between functional tasks, or between the
controller and a functional task, go via a mailbox task. The
mailbox task contains intelligence to handle the following:

(i) recognizing and buffering variable length data
messages;

(ii) recognizing and buffering control messages and giving
top priority to delivery of control messages;

(iii) delivering data messages in a priority order, in
accordance with the message priority requirements of the
source software Operating System. Same priority data
messages are delivered in first-in first-out order;

(iv) acknowledging receipt or delivery of messages
corresponding to requirements of the source software
Operating System calls. Note that this can support both
guaranteed delivery of messages as well as servicing
blocked or unblocked communication commands;

(v) receiving, interpreting, and acknowledging control
messages directed to the mailbox task itself. This is
necessary to suspend, continue or terminate a mailbox.

Mailbox tasks are created dynamically. They are reported to
the controller for the controller task and each functional
task. A mailbox task is created at the initialization of the
respective controller or functional task. Thus, there is one
mailbox task for the controller task and one for each
functional task. An additional mailbox task may be created
to establish a sending and receiving communication path
between multiple processes in the source software. This task
is also created dynamically, in the place corresponding to
where the communication path is declared in the source
software. The mailbox tasks are also shown in Figure 2. Each
mailbox task has an entry point for calls of incoming
messages and an entry point for delivery of messages.
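
The delivery order required of a mailbox (control messages first, then data messages by priority, first-in first-out within equal priority) can be sketched with a priority queue. The sketch uses Python's standard `heapq` module as a stand-in for the Ada mailbox task. The class name `Mailbox`, its two methods (standing in for the two entry points), and the convention that a larger number means a higher priority are assumptions; acknowledgements and variable-length buffering are not modeled.

```python
import heapq
from itertools import count


class Mailbox:
    def __init__(self):
        self._heap = []
        self._arrival = count()  # arrival order, breaks ties first-in first-out

    def put(self, payload, *, control=False, priority=0):
        # Entry point for incoming messages.
        rank = 0 if control else 1  # control messages are delivered first
        heapq.heappush(self._heap, (rank, -priority, next(self._arrival), payload))

    def get(self):
        # Entry point for delivery: returns the next message to deliver.
        return heapq.heappop(self._heap)[-1]

    def __len__(self):
        return len(self._heap)
```

The sort key `(rank, -priority, arrival)` encodes the three rules in order, and because arrival numbers are unique the payloads themselves are never compared.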

The  use  of  mailbox  tasks  may  at  times  add  an  overhead 
of  as  much  as  double  the  communication  time  between 
tasks.  This  overhead  must  also  be  compensated  for  by  use 
of  hardware  that  is  considerably  faster  than  the  hardware 
used  by  the  source  software. 

The  selection  of  Ada  primitives  and  their  method  of 
synthesis  takes  into  account  minimizing  the  overhead  in 
execution,  and  maintaining,  or  even  improving  the  under- 
standability  of  the  graphic  model  of  the  software.  We  dis¬ 
tinguish between two types of primitives: those that are
generic to many Operating Systems, and those that are
specific to a selected Operating System. There are some
structures and operations that are common to a number of
Operating Systems. The translation of each Operating
System call requires composing its own procedures that
implement the respective protocol of the call. Both the
generic code and the Operating System call procedures are
placed in the Ada library, as discussed in Section 5. The
procedures that execute Operating System calls are shown
in Figure 2, inside the respective task boxes.

An Operating System call may not be fully translatable
to Ada, or may be translatable only in a restricted way. For
example, the Appendix refers to the Unix fork call, which
involves creating a new process. It consists of copying an
executable image, creating an appropriate task control
structure, and rescheduling. In general, this cannot be done
directly in Ada. It can be done in Ada, however, if the code
executed in the child thread consists of a call to a predefined
pure procedure, i.e., all data used in the procedure is either
local or an explicit parameter of the procedure (Out and In
Out parameters of the procedure must have been copied
before the procedure is called).

5. Implementing Translation Of Concurrency-related
Operating System Calls

5.1 A Two Pass Process

The translation of Operating System calls is shown in
Figure 1 as a box titled "Insertion of Ada tasks and messages
for concurrency into ESL-Ada." The input of this process is
the ESL-Ada model of the software obtained from
translating the source sequential code into Ada. It includes source
Operating System calls embedded in the sequential Ada
code in the graphic software model. This concurrency
translation process replaces the source Operating System
calls in the ESL-Ada with Ada code.

The translator process is illustrated in further detail in
Figure 3. As shown, the ESL-Ada software model is
updated in two passes.

The first pass consists of:

(i) scanning the ESL-Ada graphic model of the software
to find: (a) the concurrency-related Operating
System calls that are used in the source software, (b)
which calls create processes and therefore must be
replaced by functional task declarations, and (c) the
beginning and end of the sequential thread of
execution code performed in each of the source processes.
This information is tabulated for use in Pass 2.

(ii) generating the controller task, which contains calls to
the procedures that interpret all the Operating
System calls that are used in the source software. As
noted above, the code in the library is for a single
processor. Only one controller task exists for the
entire software. It is generated in place and not
included in the Ada library. The controller task
declaration is generated in Pass 1. Its specification is
inserted at the beginning of the ESL-Ada model of the
software. It contains procedures for only the
Operating System calls in the source software.

(iii) creating a mailbox task for the controller task.

The second pass consists of:

(iv) inserting instantiations of the respective functional tasks
and their mailbox tasks in place of each Operating
System call that creates a process. The inserted code
sends a control message to the controller task,
reporting the created task control block.

(v) inserting an instantiation of a mailbox task for each
Operating System call that establishes an intertask
data message channel.

(vi) inserting in-place calls to procedures that execute
every other type of Operating System call.
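
The two passes listed above can be sketched in Python as follows. This
is only an illustrative model, not the translator described in the
paper: the flat list of statement strings stands in for the ESL-Ada
graphic model, and the call classification (PROCESS_CREATING,
CHANNEL_CREATING, OTHER_CALLS) and the placeholder text emitted by
pass_two are assumptions made for the example.

```python
import re

# Hypothetical classification of concurrency-related Unix calls.
PROCESS_CREATING = {"fork"}
CHANNEL_CREATING = {"pipe"}
OTHER_CALLS = {"read", "write", "wait", "close", "exit"}
ALL_CALLS = PROCESS_CREATING | CHANNEL_CREATING | OTHER_CALLS

# Matches an identifier that is immediately followed by "(".
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def find_os_call(statement):
    """Return the first concurrency-related OS call in a statement, or None."""
    for name in CALL_PATTERN.findall(statement):
        if name in ALL_CALLS:
            return name
    return None


def pass_one(statements):
    """Pass 1: tabulate which OS calls occur and at which statement."""
    table = []
    for index, statement in enumerate(statements):
        call = find_os_call(statement)
        if call is not None:
            table.append((index, call))
    used = sorted({call for _, call in table})
    return table, used


def pass_two(statements, table):
    """Pass 2: replace each tabulated call with a placeholder for Ada code."""
    result = list(statements)
    for index, call in table:
        if call in PROCESS_CREATING:
            result[index] = "<instantiate functional task and its mailbox>"
        elif call in CHANNEL_CREATING:
            result[index] = "<instantiate mailbox task for data channel>"
        else:
            result[index] = f"<call library procedure UNIX_{call.capitalize()}>"
    return result


source = [
    "pipe(BUF);",
    "if (!fork()) { PRODUCER(BUF[1]); }",
    "x = x + 1;",
    "wait();",
]
table, used = pass_one(source)
translated = pass_two(source, table)
```

Here `used` plays the role of the list from which the controller task
would be generated in Pass 1, and `table` is the information tabulated
for use in Pass 2. Statements that contain no Operating System call
pass through unchanged.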

5.2  Use  of  Ada  Library 

As shown in Figure 3, the creation of Ada tasks and
procedures is simplified by instantiation of pre-defined
objects in the Ada library. The Ada library is given in [6].

Figure 3: Conversion of Operating System Calls to Ada

I. Data declarations for use in:
    mailbox tasks, for receiving, sending and buffering data
        and control messages
    generic package of a functional task
    controller task

II. Types of procedures for interpreting the protocols of the
    source Operating System calls in the:
    controller task
    functional tasks
    mailbox tasks

III. Types of tasks and generic package for:
    mailbox task
    generic package that contains a functional task

Figure 4: Summary of the Ada Library
The contents of the Ada library are outlined in Figure 4.
It contains the data declarations used in the tasks shown in
Figure 2 (I). Next, the library contains procedures for the
Operating System calls (II). Finally, the Ada library
contains a declaration of a generic package for a functional task
and a mailbox type (III). Since there may be a large number
of functional tasks, the use of the generic package for the
functional task makes the software easier to understand.
The instantiation of each functional task in Pass 2 requires
providing the generic package with the following parameters:
(i) a unique name for the task, (ii) the name of the procedure
that corresponds to the code executed in the respective source
process, and (iii) the names of the procedures for interpreting
the Operating System calls by the respective functional
task. [6] contains the Ada library.
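
The role of the generic package can be illustrated with a rough Python
analogue. This is a sketch under stated assumptions, not the library of
[6]: a closure stands in for the generic instantiation, threads stand in
for Ada tasks, queue.Queue objects stand in for mailbox tasks, and the
names functional_task_pack, generate_functional_task, and the message
tuples are hypothetical.

```python
import queue
import threading


def functional_task_pack(task_body):
    """Rough analogue of instantiating the generic package with a body procedure."""

    def generate_functional_task(name, parent_mailbox, controller_mailbox, args):
        # parent_mailbox is accepted for symmetry with the Ada call but unused here.
        mailbox = queue.Queue()  # each functional task gets its own mailbox

        def run():
            # Report the new task to the controller, then run the body procedure.
            controller_mailbox.put(("NEW_TASK", name))
            task_body(controller_mailbox, mailbox, args)

        thread = threading.Thread(target=run, name=name)
        thread.start()
        return thread

    return generate_functional_task


def producer(controller_mailbox, own_mailbox, args):
    """Hypothetical body procedure: forward each item, then announce exit."""
    for item in args["items"]:
        args["channel"].put(item)
    controller_mailbox.put(("EXIT", threading.current_thread().name))


controller_mailbox = queue.Queue()
channel = queue.Queue()

# One "instantiation" per source process body, as in Pass 2.
producer_pack = functional_task_pack(producer)
task = producer_pack(
    "producer_1", None, controller_mailbox, {"items": "ab", "channel": channel}
)
task.join()

control_messages = [controller_mailbox.get() for _ in range(2)]
data_messages = [channel.get() for _ in range(2)]
```

The point of the design is that the body procedure is the only part that
differs between functional tasks; task creation, mailbox creation, and
reporting to the controller are written once.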

The synthesis of Ada software for Operating System
calls is shown in an example in [6]. It illustrates the Pass 1
and Pass 2 processes by applying this methodology
to a C program for a Producers/Consumer application
using the Unix Operating System. This C program appears
in Figure 5. Figure 0 has the corresponding Ada program.
The execution of this program is also documented in [6].

A briefer description of the C/Unix example
translated into Ada is given in the Appendix to this paper. It
shows both the C/Unix code input and the Ada code output
of the translation process.

6.  Future  Research  And  Development 

The present paper is in a sense a progress report. As
shown in Figure 1, while it reports on the translation of
concurrent source software into Ada for execution on a single
processor, it leaves to future research the problems of
checking and modifying the software to make its logic
independent of the source hardware, and of distributing
the calculations. These problems are summarized in this
section.

6.1  Distributing  Concurrent  Software  of 
Large  Scale  Applications 

Large-scale concurrent software is envisaged as
consisting of large numbers of Ada tasks divided among
software units. Further, the translation from large-scale source
software is typically performed in pieces, producing one
translation unit at a time (e.g., similar to an Ada compilation
unit). Frequently communicating pairs of tasks should be
executed on co-located processors. Less frequently
communicating tasks may be distributed geographically. The
translation of distributed software is envisaged to produce
one controller task for each set of co-located processors.
Communication between distinct sets of co-located
processors will be mediated by their controllers. Use
of Ada-9x will facilitate exchange of messages in a
geographically distributed environment.
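
A minimal sketch of this arrangement is given below. It is not part of
the paper's method: the grouping rule (a message-count threshold applied
with a union-find structure), the traffic figures, and the
"controller@site" naming are assumptions chosen only to illustrate how
tasks might be grouped into co-located sets and how messages between
sets would pass through the two controllers.

```python
def co_locate(tasks, traffic, threshold):
    """Group tasks so that pairs exchanging >= threshold messages share a site."""
    parent = {task: task for task in tasks}

    def find(task):
        while parent[task] != task:
            parent[task] = parent[parent[task]]  # path halving
            task = parent[task]
        return task

    for (a, b), count in traffic.items():
        if count >= threshold:
            parent[find(a)] = find(b)

    # A site is identified here by the representative task of its group.
    return {task: find(task) for task in tasks}


def route(sender, receiver, site_of):
    """Return the hops a message takes; controllers mediate between sites."""
    src, dst = site_of[sender], site_of[receiver]
    if src == dst:
        return [sender, receiver]
    return [sender, f"controller@{src}", f"controller@{dst}", receiver]


tasks = ["main", "producer", "consumer", "logger"]
traffic = {
    ("producer", "consumer"): 1000,  # frequent: should be co-located
    ("main", "producer"): 2,
    ("consumer", "logger"): 5,
}
site_of = co_locate(tasks, traffic, threshold=100)
local_route = route("producer", "consumer", site_of)
remote_route = route("consumer", "logger", site_of)
```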

6.2  Modifying  the  Ada  Software  For  Indepen¬ 
dence  of  the  Hardware 

In reengineering concurrent software it is necessary to
consider the possibility that the programmer of the source
software relied on knowledge of relative computation
delays, that is, on the speed of the processors and the
scheduling discipline of the Operating System used at that
time, and therefore omitted some synchronization statements.
Such synchronization commands must then be added
explicitly to the software in order to make its logic
independent of the speed of the processors used.

Finding and inserting missing synchronizations
requires the availability of specifications and an
understanding of the software. We envisage employing an interactive
man-machine procedure. The user may gain knowledge of
the software requirements by examining the display of the
graphic model of the software. In particular, the data flow
graph described in Section 3 may be useful. It
shows the flow of data and control between hierarchical
software units, such as tasks, procedures, or functions, as
well as the shared global data structures, files, inputs, and
outputs. The data flow graph thus shows in detail the
interactions of tasks with specific global data structures.
By examining the graphs in detail, the user will be able
to check the existence, or lack, of the synchronizing
messages needed to assure mutual exclusion. Missing
synchronization commands can then be inserted in the graphic
model of the software.
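
The kind of check the user would perform on the data flow graph can be
sketched as follows. This is an assumed, simplified criterion and not
the paper's procedure: each access is a hypothetical tuple (task, data
item, mode, guarded), and a data item is flagged when more than one
task touches it, at least one task writes it, and some access is not
guarded by a synchronizing message.

```python
from collections import defaultdict


def unsynchronized_shared_data(accesses):
    """Return {data item: tasks with unguarded access} for suspicious items.

    accesses: iterable of (task, data_name, mode, guarded), mode is "r" or "w".
    """
    by_data = defaultdict(list)
    for task, data, mode, guarded in accesses:
        by_data[data].append((task, mode, guarded))

    suspects = {}
    for data, entries in by_data.items():
        tasks = {task for task, _, _ in entries}
        has_writer = any(mode == "w" for _, mode, _ in entries)
        unguarded = sorted({task for task, _, guarded in entries if not guarded})
        if len(tasks) > 1 and has_writer and unguarded:
            suspects[data] = unguarded
    return suspects


accesses = [
    ("producer", "buffer", "w", True),
    ("consumer", "buffer", "r", False),  # reads shared data without a guard
    ("consumer", "count", "w", False),   # private to one task: not flagged
    ("logger", "config", "r", False),    # read-only sharing: not flagged
    ("main", "config", "r", False),
]
suspects = unsynchronized_shared_data(accesses)
```

In this example only "buffer" is reported, with "consumer" as the task
whose access would need an added synchronization command.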

6.3 Determining Required Processor and
Communication Capacity For Satisfying
Real-Time Deadlines

The objective is to determine the required capacities
of network computing and communication resources that
must be allocated to the concurrent software in order to
meet the real-time deadline requirements. The proposed
approach consists of a simulation of the schedule based on
the model of the software [3,10].

The timing requirements are typically not specified in
the source code. They must be obtained by the human user
of the software reengineering facility from the software
specification. An interactive procedure is envisaged here as
well. The user will utilize the display of the data flow
diagram at a selected level of detail. The timing information
may be added to the graph by identifying paths from starting
to ending nodes and the required maximum delay. The data
flow graph, with the timing requirements, is then used as
input to a simulation system which computes the delay in a
path and compares it with the required maximum/minimum
delays. The computation may be only an approximation if it
is based on the execution times of Ada instructions. The
simulation must be augmented with testing.
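
The comparison of a path delay with its required bounds can be sketched
as below. This is a simplified stand-in for the simulation system, which
the paper does not specify: it assumes fixed, known execution times per
node and communication times per edge (hypothetical figures, in
milliseconds) and simply sums them along one path.

```python
def path_delay(path, exec_time, link_time):
    """Sum node execution times and edge communication times along a path."""
    total = sum(exec_time[node] for node in path)
    total += sum(link_time.get((a, b), 0) for a, b in zip(path, path[1:]))
    return total


def check_deadline(path, exec_time, link_time, max_delay, min_delay=0):
    """Return (delay, True if min_delay <= delay <= max_delay)."""
    delay = path_delay(path, exec_time, link_time)
    return delay, min_delay <= delay <= max_delay


# Hypothetical timing annotations on a data flow path.
exec_time = {"read_input": 2, "filter": 5, "write_output": 1}
link_time = {("read_input", "filter"): 1, ("filter", "write_output"): 3}
path = ["read_input", "filter", "write_output"]

within = check_deadline(path, exec_time, link_time, max_delay=15)
too_slow = check_deadline(path, exec_time, link_time, max_delay=10)
```

A real simulation would also have to account for scheduling and
contention, which is one reason the text calls the computation an
approximation that must be augmented with testing.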

The simulation may also be used iteratively in
searching systematically for a near-optimal allocation of computing
resources and communications. It may also reveal possible
bottlenecks [3].
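
As an illustration of such a systematic search, the sketch below
enumerates every assignment of tasks to processors and keeps the one
with the smallest maximum processor load. It is an assumption-laden toy:
exhaustive enumeration replaces the iterative use of the simulation,
the task costs are hypothetical, and communication costs are ignored.

```python
from itertools import product


def best_allocation(task_cost, processors):
    """Return (makespan, assignment) minimizing the largest processor load."""
    tasks = sorted(task_cost)
    best = None
    for assignment in product(range(processors), repeat=len(tasks)):
        load = [0] * processors
        for task, proc in zip(tasks, assignment):
            load[proc] += task_cost[task]
        makespan = max(load)
        if best is None or makespan < best[0]:
            best = (makespan, dict(zip(tasks, assignment)))
    return best


task_cost = {"a": 4, "b": 3, "c": 3, "d": 2}  # hypothetical execution costs
makespan, assignment = best_allocation(task_cost, processors=2)

# Recompute the per-processor loads implied by the chosen assignment.
loads = [0, 0]
for task, proc in assignment.items():
    loads[proc] += task_cost[task]
```

Exhaustive search grows exponentially with the number of tasks, so for
realistic task counts a heuristic search guided by the simulation would
be needed instead.
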
APPENDIX


PRODUCER-CONSUMER C/UNIX EXAMPLE EXPLANATION


This is a C/Unix program example for the Producer/Consumer problem. The
program consists of three entities: main, producer, and consumer. At runtime,
there are three processes running concurrently, as follows:

1) A process executing the original main program, which creates a PIPE
and two child processes.

2) A process executing a copy of the main program in which the PRODUCER
function is called.

3) A process executing a copy of the main program in which the CONSUMER
function is called.

These processes are called the MAIN, PRODUCER, and CONSUMER
processes. The runtime structure of these entities and their relationships
is shown below. It includes a PIPE and the READ/WRITE operations on it.


In  the  following,  each  function  and  its  respective  operations  are  described. 

1. MAIN: It creates a PIPE and the processes that execute the PRODUCER and
CONSUMER functions. It passes a WRITE pointer to the PIPE to PRODUCER and a
READ pointer to CONSUMER for communication of messages between
PRODUCER and CONSUMER. Once both processes are active, it waits for
termination of both processes. After termination, it closes the PIPE and
terminates its execution.

2. PRODUCER: It is executed in a copy of the main() function. In
execution, a pointer to a PIPE is passed as a parameter for writing
messages. In each iteration of a loop, it writes on the PIPE a message
obtained from the input. This loop continues until the FLAG
condition is no longer met.

3. CONSUMER: It is executed in a copy of the main() function. In
execution, a pointer to a PIPE is passed as a parameter for reading messages.
In each iteration of a loop, it reads from the PIPE a message which is
then written to the output. This loop continues until the FLAG condition
is no longer met.

4. PIPE: It is an IPC mechanism used in the UNIX OS. It is a one-way
communication mechanism, with two pointers for READ and WRITE. It is
created by MAIN, and is passed to PRODUCER and CONSUMER for
communication between them.
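
The runtime structure described above can be tried out with the short
Python sketch below, which uses the same Unix primitives through the
standard os module (Unix-like systems only). It is an illustration, not
the paper's program: the second pipe used to hand the consumer's output
back to the caller, the function name run_producer_consumer, and the
use of end of file instead of a FLAG to stop the loops are assumptions
made so that the example terminates and can be checked.

```python
import os


def run_producer_consumer(message):
    """Play the role of MAIN: create a pipe, fork CONSUMER and PRODUCER, wait."""
    read_fd, write_fd = os.pipe()    # the PIPE between PRODUCER and CONSUMER
    out_read, out_write = os.pipe()  # extra pipe (not in the paper) for the result

    consumer_pid = os.fork()
    if consumer_pid == 0:            # CONSUMER child
        try:
            os.close(write_fd)
            os.close(out_read)
            while True:
                chunk = os.read(read_fd, 1)
                if not chunk:        # end of file: every write end is closed
                    break
                os.write(out_write, chunk)
        finally:
            os._exit(0)

    producer_pid = os.fork()
    if producer_pid == 0:            # PRODUCER child
        try:
            os.close(read_fd)
            os.close(out_read)
            os.close(out_write)
            for value in message:
                os.write(write_fd, bytes([value]))
        finally:
            os._exit(0)

    # MAIN keeps only the read end of the result pipe.
    os.close(read_fd)
    os.close(write_fd)
    os.close(out_write)
    with os.fdopen(out_read, "rb") as stream:
        output = stream.read()       # returns once the CONSUMER has exited
    os.waitpid(producer_pid, 0)
    os.waitpid(consumer_pid, 0)
    return output


result = run_producer_consumer(b"hello")
```

Note that MAIN closes its own copies of the pipe ends before waiting;
otherwise the consumer would never see end of file on the PIPE.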

These entities are described by comments in the following C/Unix code.

PRODUCER-CONSUMER  C/UNIX  EXAMPLE  CODE 


#include <stdio.h>     /* getchar, putchar */
#include <stdlib.h>    /* exit */
#include <unistd.h>    /* pipe, fork, read, write, close */
#include <sys/wait.h>  /* wait */

void PRODUCER(int BUF)                /* PRODUCER FUNCTION; BUF: WRITE PIPE */
{
    char MSG[1];                      /* Message to be written on PIPE */
    int FLAG = 1;                     /* Flag for while iteration */

    while (FLAG)                      /* Repeat until false */
    {
        MSG[0] = getchar();           /* Input from Standard (one character) */
        write(BUF, MSG, sizeof(MSG)); /* Write MSG on PIPE */
        /* reset FLAG */              /* Reset flag */
    }
    exit(0);                          /* Exit from execution */
}

void CONSUMER(int BUF)                /* CONSUMER FUNCTION; BUF: READ PIPE */
{
    char MSG[1];                      /* Message to be read from PIPE */
    int FLAG = 1;                     /* Flag for while iteration */

    while (FLAG)                      /* Repeat until false */
    {
        read(BUF, MSG, sizeof(MSG));  /* Read MSG from PIPE */
        putchar(MSG[0]);              /* Output to Standard (one character) */
        /* reset FLAG */              /* Reset flag */
    }
    exit(0);                          /* Exit from execution */
}

int main(void)                        /* MAIN FUNCTION */
{
    int BUF[2];                       /* READ/WRITE file descriptors of PIPE */

    pipe(BUF);                        /* Open a PIPE */
    if (!fork())                      /* Create a child process */
    {
        CONSUMER(BUF[0]);             /* Calling CONSUMER */
    }
    if (!fork())                      /* Create a child process */
    {
        PRODUCER(BUF[1]);             /* Calling PRODUCER */
    }
    wait(NULL);                       /* Wait for a child to terminate */
    wait(NULL);                       /* Wait for a child to terminate */
    close(BUF[1]);                    /* Close the PIPE (write end) */
    close(BUF[0]);                    /* Close the PIPE (read end) */
    return 0;
}



STRUCTURE  OF  TRANSLATED  ADA  PROGRAM 

The source software is translated into a procedure. One of the reasons for this
is to modularize each respective source software. This increases the readability
and understandability of the generated Ada target code.

The structure of the translated Ada code is shown in the figure below. It
consists of tasks: a controller task with a mail box, functional tasks with
mail boxes, and a pipe.


Note that C_MAIN is instantiated as a task because, under UNIX
semantics, MAIN is a process.


TRANSLATED  ADA  PROGRAM 

with LIBRARY_PACK; use LIBRARY_PACK;
with TEXT_IO; use TEXT_IO;
with Unchecked_Conversion;

procedure ADA_MAIN is

   -- <Other Ada Code for Controller and Mail Box Tasks>

   -- PRODUCER

   procedure Producer (CtnlMBoxPtr : in MAIL_BOX_P;
                       FTMBoxPtr   : in MAIL_BOX_P;
                       ArgList     : in MESSAGE_TAIL_T)
   is
      -- Local Variables:
      MSG  : CHARACTER;
      FLAG : INTEGER := 1;

      -- Conversion Dependent Local Variables:
      LocalArgList : PRODUCER_ARG_T := Byte_Str_T_2_Producer_Arg_T (ArgList);
      ByteStream   : MESSAGE_TAIL_T;
      N            : INTEGER := MSG'SIZE;
      RValue       : INTEGER;
   begin
      while (FLAG = 1)
      loop
         -- For demonstration:
         put ("Enter a character for PRODUCER (To terminate, enter 'T'):");
         get (MSG);

         -- [C/UNIX] write(BUF,MSG,sizeof(MSG));
         ByteStream := MSG_2_Byte_Str_T (MSG);
         UNIX_Write (FTMBoxPtr, LocalArgList.BUF, ByteStream, N, RValue);

         --| The 'write' OS call is translated into a call to a procedure
         --| 'UNIX_Write' defined in the library. This procedure
         --| generates a DATA message to a pipe mail box by calling a
         --| WRITE entry. In writing, it waits for an acknowledgement from
         --| the PIPE based on blocking information. In return, it receives
         --| the number of bytes written on the PIPE.

         -- /* reset FLAG */
      end loop;

      -- [C/UNIX] exit( );
      UNIX_Exit (CtnlMBoxPtr, FTMBoxPtr);

      --| The 'exit' OS call is translated into a call to the
      --| 'UNIX_Exit' procedure defined in the library. This
      --| procedure generates a CONTROL message for this OS call and
      --| passes it to Controller. When Controller receives this message,
      --| it calls the 'SYS_EXIT_Routine' procedure defined
      --| in the library. This procedure will perform the protocol of the
      --| 'exit' OS call in the source software OS.
   end Producer;


   -- CONSUMER

   procedure Consumer (CtnlMBoxPtr : in MAIL_BOX_P;
                       FTMBoxPtr   : in MAIL_BOX_P;
                       ArgList     : in MESSAGE_TAIL_T)
   is
      -- Local Variables:
      MSG  : CHARACTER;
      FLAG : INTEGER := 1;

      -- Conversion Dependent Local Variables:
      LocalArgList : CONSUMER_ARG_T := Byte_Str_T_2_Consumer_Arg_T (ArgList);
      ByteStream   : MESSAGE_TAIL_T;
      N            : INTEGER := MSG'SIZE;
      RValue       : INTEGER;
   begin
      while (FLAG = 1)
      loop
         -- [C/UNIX] read(BUF,MSG,sizeof(MSG));
         UNIX_Read (FTMBoxPtr, LocalArgList.BUF, ByteStream, N, RValue);
         MSG := Byte_Str_T_2_MSG (ByteStream);

         --| The 'read' OS call is translated into a call to a procedure
         --| 'UNIX_Read' defined in the library. This procedure
         --| gets a DATA message from a PIPE mail box by calling a READ
         --| entry. In reading, it checks for the right message by the
         --| size and type of message.

         put (MSG);

         -- /* reset FLAG */
      end loop;

      -- [C/UNIX] exit( );
      UNIX_Exit (CtnlMBoxPtr, FTMBoxPtr);

      --| The 'exit' OS call is translated into a call to the
      --| 'UNIX_Exit' procedure defined in the library. This
      --| procedure generates a CONTROL message for this OS call and
      --| passes it to Controller. When Controller receives this message,
      --| it calls the 'SYS_EXIT_Routine' procedure defined
      --| in the library. This procedure will perform the protocol of the
      --| 'exit' OS call in the source software OS.
   end Consumer;


   -- C_MAIN

   procedure C_Main (CtnlMBoxPtr : in MAIL_BOX_P;
                     FTMBoxPtr   : in MAIL_BOX_P;
                     ArgList     : in MESSAGE_TAIL_T)
   is
      -- Local Variables:
      BUF : MAIL_BOX_P;

      -- Conversion Dependent Local Variables:
      ProducerArg : PRODUCER_ARG_T;
      ConsumerArg : CONSUMER_ARG_T;
      ByteStream  : MESSAGE_TAIL_T;
      FTTskPtr    : STANDARD_TASK_P;
      RValue      : INTEGER;
      Name        : NAME_T;

      Consumer_P : STANDARD_TASK_P;
      Producer_P : STANDARD_TASK_P;

      -- Instantiation of CONSUMER package:
      package CONSUMER_PACK is new FUNCTIONAL_TASK_PACK
         (Task_Body_Procedure => Consumer);

      -- Instantiation of PRODUCER package:
      package PRODUCER_PACK is new FUNCTIONAL_TASK_PACK
         (Task_Body_Procedure => Producer);

   begin
      -- [C/UNIX] pipe(BUF);
      UNIX_Pipe (CtnlMBoxPtr, FTMBoxPtr, BUF, RValue);

      --| The 'pipe' OS call is translated into an instantiation of a
      --| mail box. In instantiation, it makes a call to the
      --| 'UNIX_Pipe' procedure defined in the library to
      --| inform Controller about this new mail box task named PIPE.
      --| This procedure generates a CONTROL message for the new task and
      --| passes it to Controller. When Controller receives this message,
      --| it calls the 'SYS_PIPE_Routine' procedure defined
      --| in the library. This procedure creates a new TCB for this
      --| mail box.

      -- [C/UNIX] if (!fork( )) { PRODUCER(BUF[1]); }
      ProducerArg.BUF := BUF;
      ByteStream      := Producer_Arg_T_2_Byte_Str_T (ProducerArg);
      Name            := Get_Task_Name;
      Producer_P := PRODUCER_PACK.Generate_Functional_Task (Name, FTMBoxPtr,
                       CtnlMBoxPtr, ByteStream);

      --| The 'fork' OS call is translated into a call to a function in the
      --| PRODUCER_PACK package, which is an instantiation of the
      --| package called 'FUNCTIONAL_TASK_PACK'. This package is defined
      --| in the library. This instantiation requires the body of the
      --| PRODUCER procedure to be passed as a parameter. A call to the
      --| 'Generate_Functional_Task' function in this instantiated
      --| package requires a set of proper parameters to be passed.
      --| These parameters are used to generate the desired functional task.
      --| They are 1) the name of the child task being created, 2) a pointer to
      --| this task, which creates the new child task, 3) a pointer to the
      --| CONTROLLER mail box, and finally 4) a list of parameters for this
      --| PRODUCER procedure in a stream of bytes. In instantiation, the newly
      --| created child task generates its own mail box, informs the controller
      --| about its own task and mail box, and finally calls its own body
      --| procedure. When Controller receives messages about these
      --| new tasks, it calls the 'NEW_TCB_Routine' procedure, in the library,
      --| to create a new TCB for this functional task and mail box. This TCB
      --| is used to inform the parent task about its child task.

      -- [C/UNIX] if (!fork( )) { CONSUMER(BUF[0]); }
      Name            := Get_Task_Name;
      ConsumerArg.BUF := BUF;
      ByteStream      := Consumer_Arg_T_2_Byte_Str_T (ConsumerArg);
      Consumer_P := CONSUMER_PACK.Generate_Functional_Task (Name, FTMBoxPtr,
                       CtnlMBoxPtr, ByteStream);

      --| The 'fork' OS call is translated into a call to a function in the
      --| CONSUMER_PACK package, which is an instantiation of the
      --| package called 'FUNCTIONAL_TASK_PACK'. This package is defined
      --| in the library. This instantiation requires the body of the
      --| CONSUMER procedure to be passed as a parameter. A call to the
      --| 'Generate_Functional_Task' function in this instantiated
      --| package requires a set of proper parameters to be passed.
      --| These parameters are used to generate the desired functional task.
      --| They are 1) the name of the child task being created, 2) a pointer to
      --| this task, which creates the new child task, 3) a pointer to the
      --| CONTROLLER mail box, and finally 4) a list of parameters for this
      --| CONSUMER procedure in a stream of bytes. In instantiation, the newly
      --| created child task generates its own mail box, informs the controller
      --| about its own task and mail box, and finally calls its own body
      --| procedure. When Controller receives messages about these
      --| new tasks, it calls the 'NEW_TCB_Routine' procedure, in the library,
      --| to create a new TCB for this functional task and mail box. This TCB
      --| is used to inform the parent task about its child task.

      -- [C/UNIX] wait( );
      UNIX_Wait (CtnlMBoxPtr, FTMBoxPtr, NULL, RValue);

      --| The 'wait' OS call is translated into a call to the
      --| 'UNIX_Wait' procedure defined in the library. This
      --| procedure generates a CONTROL message for this OS call and
      --| passes it to Controller. When Controller receives this message,
      --| it calls the 'SYS_WAIT_Routine' procedure defined
      --| in the library. This procedure will perform the protocol of the
      --| 'wait' OS call in the source software OS.

      -- [C/UNIX] wait( );
      UNIX_Wait (CtnlMBoxPtr, FTMBoxPtr, NULL, RValue);

      --| The 'wait' OS call is translated into a call to the
      --| 'UNIX_Wait' procedure defined in the library. This
      --| procedure generates a CONTROL message for this OS call and
      --| passes it to Controller. When Controller receives this message,
      --| it calls the 'SYS_WAIT_Routine' procedure defined
      --| in the library. This procedure will perform the protocol of the
      --| 'wait' OS call in the source software OS.

      -- [C/UNIX] close(BUF[1]);
      UNIX_Close (CtnlMBoxPtr, FTMBoxPtr, BUF, WRITE_PIPE, RValue);

      --| The 'close' OS call is translated into a call to the 'UNIX_Close'
      --| procedure defined in the library. This procedure generates a
      --| CONTROL message for this OS call and passes it to Controller.
      --| When Controller receives this message, it calls the
      --| 'SYS_CLOSE_Routine' procedure defined in the library. This
      --| procedure will perform the protocol of the 'close' OS call in the
      --| source software OS.

      -- [C/UNIX] close(BUF[0]);
      UNIX_Close (CtnlMBoxPtr, FTMBoxPtr, BUF, READ_PIPE, RValue);

      --| The 'close' OS call is translated into a call to the 'UNIX_Close'
      --| procedure defined in the library. This procedure generates a
      --| CONTROL message for this OS call and passes it to Controller.
      --| When Controller receives this message, it calls the
      --| 'SYS_CLOSE_Routine' procedure defined in the library. This
      --| procedure will perform the protocol of the 'close' OS call in the
      --| source software OS.

   end C_Main;

   -- Instantiation of C_MAIN package:
   package C_MAIN_PACK is new FUNCTIONAL_TASK_PACK
      (Task_Body_Procedure => C_MAIN);

begin

   -- ADA_MAIN declares C_MAIN as a FUNCTIONAL TASK by calling the
   -- Generate_Functional_Task function defined in the C_MAIN_PACK
   -- package.
   declare
      Name       : NAME_T;
      ByteStream : MESSAGE_TAIL_T;
      MainArg    : MAIN_ARG_T;
      C_Main_P   : STANDARD_TASK_P;
   begin
      Name       := Get_Task_Name;
      ByteStream := Main_Arg_T_2_Byte_Str_T (MainArg);
      C_Main_P   := C_MAIN_PACK.Generate_Functional_Task (Name,
                       ControllerMailBoxPtr, ControllerMailBoxPtr, ByteStream);
   end;

   NULL;

end ADA_MAIN;
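
The comments in the listing above describe a controller that receives
CONTROL messages and calls a library routine for each kind of Operating
System call. A toy Python model of that dispatch is sketched below. Its
details are assumptions, not the library of [6]: the message tuples, the
"running"/"terminated"/"reaped" states, and the rule that a waiting
parent is answered as soon as one of its children has exited are
invented for illustration. Only the routine names in the comments come
from the listing.

```python
class Controller:
    """Toy controller: keeps task control blocks and interprets control messages."""

    def __init__(self):
        self.tcb = {}      # task name -> {"parent": name or None, "state": str}
        self.pending = []  # parents currently blocked in a wait call
        self.replies = []  # (parent, child) pairs answered so far, in order
        self.handlers = {
            "NEW_TCB": self.new_tcb,  # cf. NEW_TCB_Routine
            "EXIT": self.sys_exit,    # cf. SYS_EXIT_Routine
            "WAIT": self.sys_wait,    # cf. SYS_WAIT_Routine
        }

    def handle(self, message):
        kind, *args = message
        self.handlers[kind](*args)

    def new_tcb(self, task, parent):
        self.tcb[task] = {"parent": parent, "state": "running"}

    def sys_exit(self, task):
        self.tcb[task]["state"] = "terminated"
        self._match_waits()

    def sys_wait(self, parent):
        self.pending.append(parent)
        self._match_waits()

    def _match_waits(self):
        """Answer each waiting parent that has a terminated, unreaped child."""
        still_pending = []
        for parent in self.pending:
            child = next(
                (name for name, block in self.tcb.items()
                 if block["parent"] == parent and block["state"] == "terminated"),
                None,
            )
            if child is None:
                still_pending.append(parent)
            else:
                self.tcb[child]["state"] = "reaped"
                self.replies.append((parent, child))
        self.pending = still_pending


controller = Controller()
for message in [
    ("NEW_TCB", "main", None),
    ("NEW_TCB", "producer", "main"),
    ("NEW_TCB", "consumer", "main"),
    ("WAIT", "main"),
    ("EXIT", "consumer"),
    ("WAIT", "main"),
    ("EXIT", "producer"),
]:
    controller.handle(message)
```

After this message sequence both wait calls issued by "main" have been
answered, first for "consumer" and then for "producer".
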
Abstract

This paper describes the Software Migration and
Reengineering (SMR) project being sponsored by the
Embedded Computing Institute. The project is nearing the
end of the first phase of a four-phase effort. The purpose of
the initial phase was to define the technical approach,
based on the expectations of the customer, investigation of
previous approaches, and the availability of existing tools.
The paper discusses the information uncovered and describes
the technical approach that was developed based on this
information. The unique contributions of this project are
expected to be in the areas of: recovery of design
information from CMS-2M and AYK-14 assembler; investigation
of representations for the desired state of the system;
interactive capture, display, and modification of the recovered
information by the user; and integration of the SMR
toolset with the F/A-18 Software Engineering
Environment via population of a mainstream, commercially
available CASE tool.

1.  Introduction 

The United States military is the world power of today;
however, post-Cold War realities have shifted American
policy from one of military buildup to one of extreme
downsizing. Part of the originally successful buildup
included the development and acquisition of highly
capable weapon systems. Although the current downsizing
effort may save financial resources, it decreases or halts
weapon system procurement, thus promoting a possible
drop in America's technological competitive edge. The
U.S. military seeks to solve this dilemma by increasing
current weapon system capabilities with key technological
advances.

A specific example of this investment is the Navy's
F/A-18 platform. While the operational capability of this
particular weapon system was being fine-tuned to meet the
expectations of those whose lives depend upon it, other
technological advances have occurred. Some of these
technological advances (e.g., improved sensors,
multisensor fusion) will translate directly into increased
operational capability, while others (e.g., advanced software
techniques, Ada, 32-bit microprocessors) will translate
indirectly into increased operational capability because
they contribute to lessening maintenance expenditures. A
key issue is how to take advantage of these technological
advances, maintain the current operational capability, and
prepare to deliver enhanced operational capabilities into
the next decade.

Recognizing these advances, the U.S. Navy seeks to
upgrade avionics systems from existing hardware
platforms to those that better support the Ada language.
Because existing software is exhibiting limitations with
respect to performance, maintainability, portability, and
scalability, there is a need for effective and thorough
software migration and reengineering mechanisms. With the
current emphasis on cost and risk management, the
solution to this problem must also support an incremental
approach.

The Software Migration and Reengineering (SMR)
project was established by the Embedded Computing
Institute (ECI) located at the Naval Air Weapons Center
(NAWC) in China Lake, California. It is a cooperative
research and development effort involving Computing
Devices International, Microelectronics and Computer
Technology Corporation (MCC), and the Embedded
Computing Institute.

1.1.  Current  System  State 

This subsection discusses the current state of the
F/A-18 system. The major points are: system level, including
sensors, buses and mission computers; software, including
a brief introduction to its structure; documentation,
highlighting differences between existing documentation and
that required by recent development standards; and
maintenance concerns, including the existence of several
versions of the F/A-18 operational software.

The entire F/A-18 system is composed of a number of
subsystems, including a variety of sensors and two
mission computers. These computers are interconnected via
several MIL-STD-1553 buses. The SMR project is
focusing on the reengineering and migration of the software
residing in the mission computers; therefore, there will be
no further discussion of the sensors.

The current mission computers are reaching the limits
of their capacity, and the need to employ multiple
MIL-STD-1553 buses is an indication that they too may be
reaching their capacity limits. However, it is important to
recognize that redundant computers and buses may be present
because of survivability requirements (e.g., fault
tolerance). Fault tolerance may be utilized to compensate for
hardware failures or battle damage.

The  existing  software  is  implemented  in  AYK-14 
assembly  language  with  some  CMS-2M.  The  current  oper¬ 
ational  software  possesses  existing  system  properties  (as 
defined  in  [2]);  functionality,  performance,  and  accuracy. 
We  consider  these  properties  as  fundamental  properties, 
which  must  be  preserved  by  any  reengineering  approach. 
These  properties  are  a  result  of: 

•  Many  years  of  engineering  effort. 

•  Extensive  interaction  with  the  end  users;  that  is, 
pilots  of  the  F/A-18. 

• Use of the system in the actual deployed environment;
for example, Desert Storm.

The  existing  “architecture”  of  the  software  is  domi¬ 
nated  by  global  data  (expressed  as  CMS-2M  SYS-DDs), 
and  functionally  oriented  modules  (expressed  as  CMS-2M 
subroutines).  Although  the  software  is  laid  out  in  terms  of 
CMS-2M  constructs,  almost  all  of  the  actual  code  is 
implemented  in  AYK-14  assembler  using  the  direct  fea¬ 
tures  of  CMS-2M.  When  an  assembler  language  module 
wishes  to  manipulate  some  global  data,  the  references  are 
hard  coded  in  the  assembler  modules. 

This  type  of  architecture,  in  addition  to  being  difficult 
to  understand,  makes  seemingly  simple  software  changes 
extremely  difficult.  For  example,  changes  in  the  represen¬ 
tation  of  a  particular  data  item  can  require  changes  in  the 
code of modules throughout the system. Similarly, a change
in a computation may require modifications
to several modules. It is important to recognize that this
style of architecture is typical of legacy systems, and was
necessary to achieve the performance that the system
required.


It is unlikely that all of the kinds of information that
accompany a deployed system can be extracted from
source code alone. The next logical step is to look for
documentation of the existing system. The existing
documentation for the F/A-18 is a mixture of MIL-STD-1679
documents, and documents created by the F/A-18 community
to meet specific operational needs. The existing
documentation is for the most part complete and consistent;
however, there are no formal software requirements or
software design documents in MIL-STD-1679.

Lastly, it should be noted that the operational software for
the F/A-18 has been in existence for a number of years. As
is normal for deployed software with a long lifetime,
several versions of the operational programs are in existence
and must be maintained. Sources of variation include:

•  Differences  in  aircraft  configurations. 

•  Differences  in  mission  requirements. 

•  Different  customers. 

As is evident from the preceding discussion, the
“current state” of the existing system is conceptually very
large and extremely complicated, consists of more than just
source code, and includes representations that are not
amenable to automated understanding.

1.2.  Desired  System  State 

It  is  difficult  to  describe  what  the  desired  system  state 
should  be  with  a  high  level  of  confidence.  Many  factors 
contribute  to  this  difficulty.  For  example: 

• At what point in time should the desired state be
achieved?

•  What  constitutes  an  acceptable  desired  state? 

•  Is  it  standards  based? 

Instead of a definitive answer regarding the structure of
the desired state, one is usually presented with
“requirements” like the following:

•  Retarget  the  software  from  16-bit  target  hardware  to 
32-  or  64-bit  target  hardware. 

• Move the software to a more modern language (i.e.,
Ada).

•  Enable  more  effective  software  maintenance. 

•  Lower  cost. 

An important part of any systems reengineering effort
is to create a definition of the desired state that meets these
kinds of requirements. The concept is illustrated in Figure
1, and the remainder of this section discusses several, but
probably not all, completed or ongoing industry efforts
that had an influence on the concept. This concept has
been extremely useful in guiding the search for widely
accepted approaches to a more detailed definition of the
desired system state.

Figure 1. A Layered View of the Desired State

The software engineering community at large has
identified other desirable properties of a software system.
These properties, which are also nicely summarized in [2],
reflect software engineering principles, facilitate
continuous change, and attempt to capture existing system
know-how.


Examples  of  these  continuous  change  properties  are: 
modularity,  standard  interfaces,  virtual  machine  abstrac¬ 
tions,  localization  of  information  about  the  application 
domain  and  implementation  technology,  parameterization, 
use of Commercial Off-the-Shelf (COTS) products, and reuse.

Examples  of  software  engineering  properties  are:  porta¬ 
bility,  structure,  readability,  testability,  data  independence, 
documented  system  understanding,  openness,  interopera¬ 
bility,  and  seamless  integration. 

Examples  of  existing  system  know-how  properties  are: 
domain  models,  domain-independent  software  architec¬ 
ture  principles,  domain-specific  architectures,  and  adapt¬ 
able  components. 

Many of these principles are well understood by some
software engineers. Several observations can be made
about these properties. First, the properties are not
mutually exclusive. For example, modularity, virtual machine
abstractions, localization of information, structure, and
domain-independent software architecture principles all
hint at the same idea. Second, some of the properties are
not the result of a particular programming language per se.
For example, use of any high-level language makes software
more portable and more readable than the use of assembly
language. Third, some of these properties may be
contradictory to the fundamental properties. For instance,
abstractions may degrade performance. Finally, some of
these properties may be in conflict with each other. One
example of potential conflict is the ability to achieve
seamless integration while employing COTS products or
reusing existing software. Based on the foregoing
discussion, it is clear that these principles need more
explicit definition.

Fortunately,  there  are  a  number  of  recently  completed 
or  currently  underway  efforts  which  address  some  of  these 
issues.  In  the  spirit  of  reuse,  we  are  attempting  to  utilize 
the  results  of  these  efforts  where  feasible.  Examples  of 
these  efforts  are: 

• The Navy’s Next Generation Computer Resources
(NGCR) Operating System Interface Specification
[27]. The OSSWG has adopted the IEEE’s Portable
Operating System Interface Specification (POSIX)
[22-26] as the operating system interface
specification.

• The Domain Specific Software Architecture (DSSA)
program [5].

•  The  Ada  9X  effort  [18]. 

• The SEI’s Rate Monotonic Analysis (RMA) and
Distributed Real-Time System Design efforts [16,
17].

•  The  SEI’s  Generic  Avionics  System  Specification 
effort  [15]. 

•  Software  requirements  and  design  methods  like  the 
Ada-based  Design  Approach  for  Real-Time  Systems 
(ADARTS) [8].

Each of these efforts contributes some insight into what
characteristics a system and software architecture should
possess. A more detailed description of the use of these
efforts by the SMR project will be presented in the
methods definition section of the paper.

Although  not  directly  applicable  to  the  architecture  of 
the  target  software  (i.e.,  the  operational  flight  programs), 
substantial  effort  has  been  expended  in  the  Software  Engi¬ 
neering  Environment  (SEE)  area.  This  work  is  important 
because  it  addresses  requirements  and  design  processes, 
and  the  production  of  DoD-STD-2167A  system  and  soft¬ 
ware  documentation.  For  example,  the  DoD-STD-2167A 
Software  Requirements  Specification,  and  Software 
Design  Documents  are  targeted  by  specific  tools  in  a  SEE. 
Some  of  the  more  notable  SEE  efforts  include: 

•  The  Joint  Integrated  Avionics  Working  Group 
(JIAWG)  SEE,  as  exemplified  by  the  F-22  program. 

• The Software Technology for Adaptable, Reliable
Systems (STARS) program.

• The Navy’s Next Generation Computer Resources
(NGCR) Program Support Environment Specification
Working Group (PSESWG).

• The National Institute of Standards and Technology
(NIST) environment work.

These  efforts  have  been  defining  reference  models  and 
standards  in  the  SEE  area.  The  SMR  project  will  utilize 
these  reference  models  and  standards  as  integration 
requirements.  The  potential  utilization  of  these  efforts  will 
be  discussed  in  the  process  definition,  and  tool  selection 
and  creation  sections  of  the  paper. 

1.3. Derived Requirements

The  previous  two  subsections  have  briefly  discussed 
the  current  and  desired  states  of  the  F/A-18  system.  In 
addition  to  this  more  technically  oriented  desirable  proper¬ 
ties  view,  there  is  also  a  programmatic  and  acquisition  ori¬ 
ented  view.  This  view  demands,  because  of  the  need  to 
continue  to  serve  the  fleet  and  because  of  declining  bud¬ 
gets,  that  the  transition  from  the  current  system  state  to  the 
desired  system  state  be  evolutionary. 

Evolution  of  the  current  state  to  the  desired  state  will  be 
a  multi-year  effort,  with  several  incremental  improve¬ 
ments  to  the  current  state.  Evolution  will  occur  in  two 
areas: 

•  Product  evolution.  That  is,  the  transition  of  the  oper¬ 
ational  software  from  the  current  state  to  the  desired 
state. 

•  Process  evolution.  That  is,  the  transition  of  the  pro¬ 
cedures,  methods,  and  tools  used  to  create  and  main¬ 
tain  the  operational  system. 

In  the  very  near  term,  the  perceived  risk  in  converting 
any  part  of  the  OFP  to  Ada  is  very  high.  This  is  due  to  a 
number  of  factors.  First,  because  it  is  too  expensive  to 
convert  the  entire  OFP  to  Ada,  only  a  part  of  it  will  be  con¬ 
verted.  The  fact  that  only  a  part  of  the  OFP  will  be  con¬ 
verted  means  that  the  structure  of  the  existing  OFP  must 
be  very  well  understood  so  that  the  interfaces  between  the 
“old” code and the “new” code can be clearly specified,
designed, and implemented. Second, this initial
conversion effort also involves moving from a 16-bit computer
to a 32-bit computer.
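
The interface problem raised by a partial conversion can be illustrated with a small sketch. It assumes that reverse engineering yields, for each subroutine, the global data items it references and the subroutines it calls; the interface between the converted and unconverted parts is then the data touched from both sides plus the calls that cross the boundary. All subroutine and data names below are hypothetical and are not taken from the F/A-18 OFP, and the sketch is not a description of the SMR toolset.

```python
# Hypothetical recovered facts: subroutine -> global data items it references.
DATA_REFS = {
    "NAV_UPDATE": {"POSITION", "VELOCITY"},
    "NAV_FILTER": {"VELOCITY", "SENSOR_RAW"},
    "DISPLAY_FMT": {"POSITION", "DISPLAY_BUF"},
    "BUS_INPUT": {"SENSOR_RAW"},
}

# Hypothetical recovered facts: subroutine -> subroutines it calls.
CALLS = {
    "NAV_UPDATE": {"NAV_FILTER"},
    "NAV_FILTER": set(),
    "DISPLAY_FMT": set(),
    "BUS_INPUT": {"NAV_FILTER"},
}


def interface(converted, data_refs, calls):
    """Return (shared data items, boundary-crossing calls) for a partition."""
    remaining = set(data_refs) - set(converted)
    # Data referenced by the part to be converted and by the part left as is.
    new_data = set().union(*(data_refs[s] for s in converted))
    old_data = set().union(*(data_refs[s] for s in remaining))
    shared = new_data & old_data
    # A call crosses the boundary when exactly one end is converted.
    crossing = {
        (caller, callee)
        for caller, callees in calls.items()
        for callee in callees
        if (caller in converted) != (callee in converted)
    }
    return shared, crossing


shared_data, crossing_calls = interface(
    {"NAV_UPDATE", "NAV_FILTER"}, DATA_REFS, CALLS
)
print(sorted(shared_data))
print(sorted(crossing_calls))
```

In this toy partition, two data items and one call make up the interface that would have to be specified, designed, and implemented between the “old” and “new” code.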

To mitigate this programmatic risk, the SMR toolset
will be utilized to aid in understanding and identifying a
subset of the F-18 avionics Operational Flight Program
(OFP) to be modified and adapted. Part of the OFP will
continue to execute on an Advanced AYK-14 computer
containing a VHSIC Processor Module (VPM), and part of
the OFP will execute on an Intel 80960-based processor
module. To ensure effective operational test and
evaluation, all functional and performance testing will be
conducted in the NAWC simulation facility.

In addition to migrating the software, requirements in
the process also include the use of new development
standards (e.g., DoD-STD-2167A), more advanced software
techniques, and a more advanced SEE. These
requirements mean that the information recovered by the SMR
toolset must be consistent with the information
requirements of the chosen techniques, and that the SMR toolset
must be integrated with the chosen toolset.

In the medium term, two to five years, there is a desire
to  integrate  new  operational  capabilities  into  the  F/A-18, 
and  to  employ  more  COTS  technologies.  Both  of  these 
require  thorough  understanding  of  the  existing  capabili¬ 
ties,  and  clear  specification,  design  and  implementation  of 
interfaces  between  the  existing  capabilities  and  the  new 
capabilities. 

The long term objective is to provide a comprehensive
set of tools for orderly, incremental, and semi-automated
migration of 16-bit CMS-2 and AYK-14 assembly
operational flight programs (OFP) to a 32-bit RISC architecture
with the Ada language. The SMR project will integrate
existing applicable processes, methods, and tools while
developing missing elements for providing the additional
capabilities and uniting the disparate pieces.

Meeting  the  near,  medium,  and  long  term  objectives 
and  providing  an  integrated  solution  requires  the  SMR 
project  to  address  the  following  topics: 

•  Business  case  -  investigate  the  cost-effectiveness  of 
re-engineering  existing  software  versus  new  soft¬ 
ware  development. 

•  Processes  -  analyze  the  procedures  that  concern  not 
only  migration  and  re-engineering  but  new  function¬ 
ality  additions,  overall  system  architecture  modifica¬ 
tions,  and  interaction  with  systems  engineering,  test, 
and  software  reuse. 

• Methods - consider existing software understanding,
inter-language translation issues, current design
representation, new analysis and design model usage,
and reuse.

•  Tools  -  analyze  software  programs  with  capabilities 
of  reverse  engineering,  design  viewing,  design  re¬ 
engineering,  and  forward  engineering. 

The remaining sections of the paper address these four
topics.

2. Business Case


As discussed in [1], there is a reengineering decision
making process, which consists of four steps: 1) identify
alternative strategies, 2) perform an economic assessment
of the alternatives, 3) select the preferred strategy, and 4)
implement the selected strategy.

2.1.  Identification  of  Alternative  Strategies 

Several  different  reengineering  “strategies”  are  identi¬ 
fied  by  [1].  The  strategies,  along  with  their  definitions,  are: 

•  Restructuring.  The  engineering  process  of  trans¬ 
forming  the  system  from  one  representation  form  to 
another  at  the  same  relative  abstraction  level,  while 
preserving  the  subject  system’s  external  functional 
behavior. 

• Redocumentation. The process of analyzing the
system to produce support documentation in various
forms including user’s manuals, and reformatting the
system’s source code listings.

• Reformat. The engineering process of reformatting
the  existing  source  code  so  that  it  is  easier  to  under¬ 
stand  and  maintain,  while  preserving  the  subject  sys¬ 
tem’s  external  functional  behavior. 

•  Reverse  engineering.  The  engineering  process  of 
understanding,  analyzing,  and  abstracting  the  sys¬ 
tem  to  a  new  form  at  a  higher  abstraction  level. 

•  Forward  engineering.  Forward  engineering  is  the  set 
of  engineering  activities  that  consume  the  products 
and  artifacts  derived  from  legacy  software  and  new 
requirements  to  produce  a  target  system. 

• Redevelop. Redevelopment refers to developing
new software from scratch, ignoring any existing
software.

The reengineering strategy selection process uses
a number of questions concerning product complexity,
environmental risk, and system lifetime. As one of the
initial tasks in our reengineering effort, we applied this
process and derived the following metrics:

•  Average  Product  Complexity  Value  =  medium  (the 
actual  score  was  2.20). 

•  Environmental  Risk  =  medium  (the  actual  score  was 
2.05). 

•  Remaining  System  Life  =  Long. 

Table 1 is the matrix from [1]. The Y axis of this table
is product complexity, and the X axis is environmental
risk. The highlighted area of the table shows the
reengineering strategies suggested by the selection procedure.

Although restructuring was identified as the advised
strategy by strict application of this procedure, it was
rejected because of the desire to transition to Ada.
Restructuring was viewed as the restructuring of the
existing CMS-2 and AYK-14 assembler implementation.

In addition, NAWC wants the OFP software, currently
written in CMS-2/AYK-14 assembler, rewritten in the
Ada language. This conflicts with the accepted definition
of restructuring, which stays at the same relative
abstraction level: both representations would be
programming languages, but of very disparate types.

The  two  chosen  alternative  strategies  are  reverse  then 
forward,  and  redevelop. 

2.2.  Economic  Assessment  of  the  Alternatives 

A very high-level economic analysis was performed
using the COCOMO model with the 1988 Ada process
model, which is consistent with DoD-STD-2167A. A
COCOMO model was constructed, with the size of the
system estimated to be 50K source lines of code (SLOC) of
newly developed Ada software, to represent development of
the system from scratch; this matches the “redevelop”
strategy identified above. The effort multipliers were set
according to the characteristics of the product (high
reliability and complexity), people (high application
experience and analyst capability), and process (high
modern programming practices and tools). This baseline
model yielded a schedule duration of 16.5 months, effort of
150.6 staff months, and productivity of 331.9 SLOCs per
staff month.

In order to calculate the potential savings from a
reverse then forward strategy, some assumptions were made
about the effectiveness of the reverse engineering toolset.
The assumptions were as follows:

• High Level Design Reuse on a scale of 0.0 (no
reuse) to 1.0 (full reuse). Since the current design is
based on functional decomposition, and an
object-based design is desired, we set this factor to 0.25.

•  Software  Salvage  on  a  scale  of  0.0  (no  salvage)  to 
1.0  (full  salvage).  This  factor  is  an  indication  of  the 
amount  of  low  level  design  and  existing  code  that 
can  be  recovered.  This  factor  was  set  to  0.50. 

•  Translation  Efficiency  on  a  scale  of  0.0  (no  transla¬ 
tion)  to  1.0  (full  translation).  We  arbitrarily  set  this 
to  0.75. 

These assumptions lead to an Adaptation Adjustment
Factor (AAF) equal to 0.704875. The original SLOC
estimate of 50K lines is multiplied by the AAF, resulting in a
SLOC estimate of 35244, which is fed back into the
baseline COCOMO model. This COCOMO model yielded a
schedule duration of 14.4 months, effort of 101.2 staff
months, and productivity of 348.3 SLOCs per staff month.
Since savings are usually measured in effort saved, we
have 150.6 - 101.2 = 49.4 staff months saved.
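
The arithmetic in this comparison can be rechecked with a few lines of Python. The sketch below takes the AAF and the two COCOMO effort figures as given in the text; it does not re-derive the AAF from the three reuse factors or rerun the COCOMO model. The percentage saving is a derived figure that is not stated in the paper.

```python
# Figures quoted in the text; the AAF is taken as given, not re-derived.
BASELINE_SLOC = 50_000
AAF = 0.704875
BASELINE_EFFORT = 150.6  # staff months, "redevelop" strategy
ADJUSTED_EFFORT = 101.2  # staff months, "reverse then forward" strategy

# Size fed back into the baseline COCOMO model.
adjusted_sloc = round(BASELINE_SLOC * AAF)

# Savings measured in effort, as in the text.
effort_saved = round(BASELINE_EFFORT - ADJUSTED_EFFORT, 1)
percent_saved = round(100 * effort_saved / BASELINE_EFFORT, 1)

# Productivity of the reverse then forward strategy.
adjusted_productivity = round(adjusted_sloc / ADJUSTED_EFFORT, 1)

print(adjusted_sloc, effort_saved, percent_saved, adjusted_productivity)
```

Under these figures, the reverse then forward strategy saves roughly one third of the baseline effort.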

2.3. Select Preferred Strategy

We believe that the strategy followed will be a reverse
then forward strategy. Even if a redevelop strategy were
envisioned, the development staff would naturally go back
to the existing system and reverse engineer some, if not
most, of the functionality. Without the SMR toolset, this
reverse engineering activity would be mostly manual.

2.4.  Implement  the  Selected  Strategy 

Implementation of the reverse then forward strategy is
very much like a new development supplemented by the
reengineering environment. This includes integration of
reengineering processes with the “normal” developmental
processes; integration of reengineering methods with the
chosen development methods; and integration of
reengineering tools with normal developmental tools. The key
is letting the people control the application of the
processes, methods, and tools.

3.  Process  Definition 

As  with  any  other  software  system,  an  accurate  process 
must  be  defined.  As  illustrated  in  Figure  2,  SMR  defines  a 
well-structured  process  for  migrating  CMS-2/AYK-14 
assembler  to  the  Ada  language.  The  following  subsections 
describe the essential procedures for the SMR process
definition and additional procedures that SMR takes into
consideration.

3.1.  Migration  and  Reengineering  Procedures 

Software migration provides a means for moving
functionality representation from one software programming
language to another. Re-engineering leverages the value of
pre-existing system software assets to renovate or
reconstitute them in a new form, while retaining and possibly
augmenting their essential functionality.

Re-engineering  encompasses  certain  aspects  of  both 
reverse  and  forward  engineering.  The  reverse  engineering 
process  analyzes  an  existing  system  for  generating  its 
higher  level  abstractions  [6].  The  forward  engineering  pro¬ 
cess  moves  a  high-level  conceptual  abstraction  of  a  system 
to  its  logical  design  and  final  physical  implementation 
(i.e.,  code). 

The initial step identifies the collection of software to
which the software migration and re-engineering process
will be applied. Once identified, the next step involves
parsing the identified software to recover the baseline
design into an alternate representation in a repository.
Applying a level of language-specific heuristics during
parsing provides the ability to interpret the entire set
of code. Once in this alternate representation, the
recovered information is better suited for supporting other
migration and re-engineering procedures.


Because the SMR project does not want to conflict with
the steps identified in DOD-STD-2167A, this critical
reverse engineering step was given careful consideration. The
project decided that reverse engineering would only go to
the preliminary design level. This decision was primarily
based upon project life and resources and system/software
requirements recovery complexity.

The next step provides viewing and modification
facilities for the recovered design. Static and dynamic analysis
assists in determining the various components and their
interrelationships. This captured information can be
viewed through higher-level artifacts such as control-flow
diagrams, module structures, program invocation graphs,
stratified views, and variable access structures. Different
perspectives of the recovered information facilitate the
understanding of the system, its various components, and
their interaction. Applying domain knowledge, external
information, documentation, deduction, or fuzzy
reasoning techniques to the recovered design information assists
the user in attaining a better understanding of the system
to be re-engineered.
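
As a rough illustration of one such view, a program invocation graph can be held as a mapping from each subroutine to the subroutines it calls. The sketch below is a minimal example under that assumption: the subroutine names are hypothetical, and the mapping is not the repository format of the SMR toolset. It derives the call depth of each subroutine from a chosen root, the callers of a given subroutine, and the subroutines never reached from the root.

```python
from collections import deque

# Hypothetical call relations recovered by parsing; not actual OFP subroutines.
INVOCATIONS = {
    "EXEC": ["NAV", "DISPLAY"],
    "NAV": ["FILTER", "MATH_LIB"],
    "DISPLAY": ["MATH_LIB"],
    "FILTER": ["MATH_LIB"],
    "MATH_LIB": [],
    "SPARE": [],
}


def call_depths(graph, root):
    """Breadth-first walk giving each reachable subroutine its call depth."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in graph.get(caller, []):
            if callee not in depths:
                depths[callee] = depths[caller] + 1
                queue.append(callee)
    return depths


def callers_of(graph, target):
    """List the subroutines that invoke `target` directly."""
    return sorted(c for c, callees in graph.items() if target in callees)


depths = call_depths(INVOCATIONS, "EXEC")
unreachable = sorted(set(INVOCATIONS) - set(depths))
print(depths)
print(callers_of(INVOCATIONS, "MATH_LIB"))
print(unreachable)
```

A view like this makes widely shared subroutines and apparently unused ones visible before any re-design is attempted.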

While applying appropriate transformations, the
recovered design information can be augmented by re-design,
structuring, modularization, and grouping. A level of
semi-automation accelerates the re-engineering activity.
At this time, available maintenance histories can be
evaluated and applied to the re-engineered design. Design
re-engineering provides the means for taking advantage of
the target language constructs during code generation.
Once the re-engineering of the design is complete, the Ada
language representation is generated.

The  code  generation  procedure  must  not  only  create 
fully  compilable  Ada  code,  it  must  also  adhere  to  the  cur¬ 
rent  MIL-STD-1815A  for  the  Ada  programming  language. 

The  next  step  implements  the  generated  code  through 
compilation,  linking,  and  run-time  execution  facilities. 
The  implementation  will  be  performed  hand-in-hand  with 
unit  test,  the  first  step  for  testing  DOD-STD-2167A  soft¬ 
ware  systems.  As  a  parallel  activity,  test  case  generation 
allows  for  subsequent  system  level  testing  of  the  migrated 
and  re-engineered  application. 

3.2.  Overall  System  Architecture  Modifications 

Some  programming  languages  require  specific  hard¬ 
ware  considerations  before  implementation,  especially 
assembly  type  languages.  This  dependency  could  have 
directed  the  original  design  and  implementation  of  the 
application. Therefore, migration of applications from one
programming language to another must consider the
overall system architecture.


A  function  of  the  design  recovery  step  will  aid  the  iden¬ 
tification  of  these  system  architecture  dependencies.  By 
recovering  to  an  informative  design  state,  the  recovered 
articles  are  viewed  for  their  original  implementation  and 
then  re-engineered  to  the  desired  design  state.  This  design 
state  re-engineering  allows  for  modification  of  the  overall 
system  architecture. 

3.3.  System  Engineering  Interaction 

Any system must contain a definition of requirements
and design. The logical steps for performing software
development under the DOD-STD-2167A methodology
include system/segment requirements, system/segment
design, software requirements, and software design.
When integrating reverse engineering with forward
engineering, the process must identify the logical insertion
point into the DOD-STD-2167A methodology. In
addition, successful reverse engineering promotes scalability
with system engineering processes.

The migration and re-engineering process implements
steps for reverse engineering of an assembly-based
application, re-engineering of the recovered baseline design,
and forward engineering for implementation onto a new
hardware architecture. The SMR project currently focuses
on reverse engineering to an abstracted design state.
This abstracted design state relates to the
preliminary software design for DOD-STD-2167A software
development. The next logical reverse engineering steps
recover software requirement and then system
requirement/design information.

3.4.  Maintenance  Processes 

Ideally,  a  software  system  will  have  a  long  lifetime, 
maximizing  the  benefit  of  the  initial  development.  Re¬ 
designs,  upgrades,  and  extensions  require  knowledge  of 
the implemented system. Often, this knowledge is
lost when the original implementors are no longer
available. Therefore, maintenance should always be a
contributing factor when defining a software system process.

The SMR project realized the necessity of a strong
maintenance process because of the underlying purpose of
reverse engineering a legacy system. SMR’s identified
process increases future maintenance efficiency
by providing a scalable and adaptable design
repository. Current trends in software engineering practices for
software system development show programming
evolving to designing. Once the design is complete and stored in
a repository, a code generator can be utilized for the target
software application, reducing the dependence on
knowledge of a specific programming language.

3.5. Testing

Testing is another integral part of a software migration
and re-engineering process. Testing provides the
feedback for securing the integrity of the developed
software system. As described in section 5, the SMR toolset
provides reverse engineering while the F/A-18 SEE
provides forward engineering and testing. Therefore, the
SMR toolset will not have any impact on the normal unit
and system testing procedures.

3.6.  Documentation 

A  comprehensive  process  must  provide  a  means  for 
sufficiently  documenting  the  migrated  and  re-engineered 
software  system.  At  each  step  of  the  SMR  process,  a  path 
provides  production  of  the  necessary  documentation. 
Good  documentation  aids  in  knowledge  retention  and 
future  maintenance. 

3.7.  Reuse 

The  defined  SMR  process  will  hopefully  further  pro¬ 
mote  the  notion  of  reuse.  By  including  an  object-oriented 
design  repository,  the  SMR  process  promotes  reuse  of 
software  design  objects.  In  addition,  creating  the  target 
application  with  the  Ada  language  supports  reuse  of  its 
software  modules. 

4. Methods Definition

Figure  3  depicts  a  highly  simplified  view  of  a  re-engi¬ 
neering  example  based  on  the  SMR  problem.  At  this  level, 
reengineering  seems  deceptively  simple.  It  seems  as 
though  it  should  be  possible  to  write  computer  programs 
that  can  understand  the  existing  F-18  Operational  Flight 
Program(s),  which  are  written  in  a  mixture  of  CMS-2  and 
AYK-14  assembly  language,  and  generate  some  interme¬ 
diate  representation.  From  this  intermediate  representation 
one  could  then  forward  engineer  to  Ada  source  code. 
However, very few, if any, high-level design decisions can
be captured by examining existing source code. It seems
that this is true, independent of the implementation
language.

Figure 3. High-Level View of Reengineering

Although we may expect that a significant amount of the
intermediate representation can be automatically
generated by the reverse engineering tools, we must recognize
that there are key pieces of information that will be
manually entered into the intermediate representation by the
people working on the SMR project. Given the limited
financial resources of the SMR project, it is not our desire
to create a new intermediate representation. For at least
these reasons, the choice of the intermediate
representation must leverage mainstream commercial Computer
Aided Systems or Software Engineering (CASSE) tools.

4.1.  Software  Understanding 

In  order  to  achieve  the  maximum  “return  on  invest¬ 
ment”  from  the  existing  system,  there  are  a  number  of 
things  we  wish  to  abstract  from  the  existing  software. 
Among them are:

•  Data  items. 

•  Subroutines. 

•  Data  reference  patterns. 

•  Processes. 

One of the first tasks in understanding the existing system
is to identify what the major data items are; that is,
what are the major variables. Initially, it is only important
to discover the names of the variables and their relative
size. At a later time, when more is understood about how
the variables are used, decisions can be made as to how to
represent them in Ada; that is, what type to use.

Another  task  is  to  identify  the  subroutines  that  exist  in 
the  current  implementation. 

With these two pieces of information in hand, an important
next step is to identify the reference patterns of
subroutines to data items. The importance of this type of
information has been cited several times in the literature
[4, 12], and appears to be important regardless of source
and target language choices. This information is important
for at least two reasons.

First, from the point of view of constructing an “object-based,”
or “object-oriented” design*, it is important to
understand what subroutines use what data and what kind
of use it is (i.e., read or write). This will promote a modular
design approach based on information hiding [14] concepts,
rather than functional decomposition.

The second, more detailed, tidbit of understanding this
enables has to do with how the subroutines use the data
structures. It is hypothesized that this type of understanding
will support the selection of types (e.g., records,
arrays), and perhaps some representation issues (e.g.,
memory layout) as well.
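
The reference-pattern idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subroutine and data item names are hypothetical, and the reference records are typed in by hand, whereas on a real project they would come from a parser of the legacy source. Grouping each data item with the subroutines that write it is one simple heuristic for proposing information-hiding modules; it is not presented as the SMR project's method.

```python
from collections import defaultdict

# Hypothetical reference records: (subroutine, data item, kind of use).
# On a real project these would be produced by parsing the legacy source.
REFERENCES = [
    ("UPDATE_NAV", "POSITION", "write"),
    ("UPDATE_NAV", "VELOCITY", "read"),
    ("READ_INS", "VELOCITY", "write"),
    ("DRAW_HUD", "POSITION", "read"),
    ("DRAW_HUD", "FUEL_QTY", "read"),
    ("READ_FUEL", "FUEL_QTY", "write"),
]


def reference_matrix(references):
    """Map each data item to the sets of subroutines that read and write it."""
    matrix = defaultdict(lambda: {"read": set(), "write": set()})
    for subroutine, item, kind in references:
        matrix[item][kind].add(subroutine)
    return dict(matrix)


def candidate_objects(references):
    """Pair each data item with its writers as a first-cut module proposal."""
    matrix = reference_matrix(references)
    return {item: sorted(uses["write"]) for item, uses in matrix.items()}


if __name__ == "__main__":
    for item, writers in sorted(candidate_objects(REFERENCES).items()):
        print(item, "is written by", writers)
```

Readers of a data item that sit outside its writer group then show up as the candidates for an exported access operation on the module.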

Another piece of information that is important for
understanding the software is the existence of “processes.”
Many systems, both real-time and non real-time,
are measuring physical parameters. Examples of this occur
when the software is implementing things like differential
equations, integrals, or Kalman filters. These types of
mathematically oriented applications have time-dependent,
cyclic behaviors that are extremely important to recognize
in order to more fully understand the existing
software.

There may be several other more hardware oriented
pieces of information that need to be recognized in order
to create a more robust approach to software understanding.
An example of this is addresses that correspond to
memory-mapped I/O. The approach used for software
understanding will need to remain flexible enough to
incorporate these pieces of information.

4.2.  Current  Design  Representation  Issues 

One of the more difficult choices is how to represent
the information extracted from the current state. As indicated
in Figure 3, the information could be represented as
entity-relationship diagrams, abstract syntax trees, data
flow diagrams, and so forth.

Many different approaches to the intermediate
representation have been tried [10,11]. As we move to
representations that attempt to capture more of the semantic
content of the existing state (for example, problem domain
oriented concepts), there seems to be a need to move
towards knowledge-based representations. One of the key
ingredients in representation is flexibility.

* [...] oriented design. The purpose of this phrase is to let
the reader know [...] and that Ada 83 is more object-based
than it is object-oriented.

4.3. Inter-Language Translation Issues

One of the principal arguments against inter-language
translation approaches is that they do not take advantage
of the power of the target language. Because the target
language is Ada, the features discussed in the following
paragraphs possess the “power” that the SMR design recovery
approach will take advantage of:

Packages  are  one  of  the  most  powerful  features  of  the 
Ada  language,  because  they  enforce  many  of  the  software 
engineering  principles  cited  in  section  1.3.  Examples  of 
the  principles  packages  directly  support  are:  modularity, 
information  hiding,  and  standard  interfaces. 

Use of packages coupled with the use of strong typing
will increase the reliability and readability of the operational
software. In particular, private and limited private
types will be key to the SMR design recovery approach.
Effective use of these Ada features will facilitate data
independence and localization of information about the
implementation technology. Since implementation technology
is changing rapidly, localization of information
pertaining to the current implementation technology will
be a key factor in adhering to the continuous change
property.

Ada tasks are also an important feature since they allow
the expression of concurrency. Concurrency in real-time
systems has been a major source of concern over the past
few years. The SMR design recovery approach will utilize
Ada tasking and rate monotonic analysis concepts in order
to facilitate recovery of the concurrent aspects of the
existing design.
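
As a rough illustration of what a rate monotonic check looks like, the sketch below applies the classic Liu and Layland utilization bound, U <= n(2^(1/n) - 1), to a set of periodic tasks. The text does not say which rate monotonic test the project uses, so this particular test is an assumption, as are the task execution times and periods in the usage lines. The bound is sufficient but not necessary: a task set that fails it may still be schedulable and would need a more exact analysis.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound for n periodic tasks under rate monotonic priorities."""
    return n * (2 ** (1.0 / n) - 1)


def total_utilization(tasks):
    """tasks is a list of (execution_time, period) pairs in the same time unit."""
    return sum(execution / period for execution, period in tasks)


def passes_rm_bound(tasks):
    """True when the task set meets the sufficient utilization test."""
    return total_utilization(tasks) <= rm_utilization_bound(len(tasks))


if __name__ == "__main__":
    # Hypothetical cyclic processes recovered from legacy code, in milliseconds.
    light_load = [(1, 4), (1, 5), (2, 10)]
    heavy_load = [(2, 4), (2, 5), (2, 10)]
    print(passes_rm_bound(light_load))
    print(passes_rm_bound(heavy_load))
```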

Finally, if the new software is expected to be reused in
other applications, Ada generics may be an important feature
to utilize. Some recent work has cited generics as
being key to reusable Ada software.

4.4.  New  Analysis  and  Design  Model  Usage 

When performing a transformation from a source form
to some target form, it is important to understand both
forms. Sections 4.1 and 4.2 dealt with the definition of the
source. This section deals with the definition of the target
form.

It is important to remember that, in addition to supporting
the reengineering and migration of the F/A-18 software
from CMS-2M/AYK-14 assembly to Ada, upgrading
the documentation to DoD-STD-2167A is also desirable.
Fortunately, substantial work has already been done in this
area.

Analysis and design methods and notations have been
in existence for many years. Within the last ten years or so,
many CASE tools have emerged to assist in the implementation
of these methods and creation of the associated
notations. There are a number of SEE efforts that are trying
to create integrated collections of these tools.

A prime example of these efforts is the F-22 SEE.
Although its definition may not be entirely complete, several
leading CASE tools focusing on analysis and design
are a part of that SEE. Examples include: Ascent Logic’s
RDD-100; i-Logix’s Statemate; Cadre’s Teamwork; IDE’s
Software through Pictures; and SES/Workbench.

Personnel from the SMR project have investigated each
of these tools, either through previous experience or
through current work at their respective organizations.
Because of customer preferences, and the desire to reengineer
the existing software to an Ada-based design representation,
we have decided to concentrate on Cadre’s
Teamwork/Ada product.

Each CASE tool implementation is trying to support
the automation of one (or more) technical methodologies.
For example, there are different versions of Cadre’s Teamwork.
Teamwork/SART supports the Hatley/Pirbhai methodology
[19], Teamwork/OOA supports Object-Oriented
Analysis [20], and Teamwork/Ada supports the
Buhr notation [21]. There is also a version of Teamwork
that supports the ADARTS methodology [8], which basically
integrates concepts from Hatley/Pirbhai and Buhr
(unfortunately, we have been unable to evaluate this
product).

Each methodology has a preferred style of representing
information about requirements, design, or both. The
methodology usually specifies a number of graphical symbols,
the problem domain concept that the symbol stands
for, what other information may be associated with each
symbol, relationships among symbols, and rules for constructing
representations that contain many symbols.

For  example,  the  Hatley/Pirbhai  methodology  uses  data 
and  control  flow  diagrams  as  the  primary  representations, 
and  includes  state  transition  diagrams.  Shlaer/Mellor  uses 
a  form  of  entity-relationship  diagrams,  supported  by  data 
and  control  flow  diagrams. 

The goal for SMR is to perform design recovery back
to one of these representations. It may be possible to
recover to several of these representations; however, due
to funding limitations, the project team decided upon one
option, the Buhr notation. This notation was chosen
because of its inherent Ada structure graph qualities, Ada
being the target software programming language.

5.  Tool  Selection  and  Creation 

As with any other software system, the SMR project
needs computer models for supporting the defined process
and methods. Through software tools, the computer models
provide the semi-automated means for assisting the
user through the defined process. The following sub-sections
provide a high-level description of the F/A-18 SEE
and SMR’s relationship with it, describe the process for
tool selection, identify the additional software necessary,
and present the SMR toolset solution.

5.1.  F/A-18  Software  Engineering  Environment 

The F/A-18 Software Engineering Environment (SEE)
contains a collection of software engineering tools for system
analysis and design, software analysis and design,
code development, compiling, debugging, configuration
management, documentation, requirements traceability,
and project management. The F/A-18 SEE provides the
foundation for future F/A-18 new software development.
While being comprehensive for forward engineering (i.e.,
systems analysis through code generation and testing), the
SEE lacks reverse engineering for information capture of
existing applications.

The SMR project solves the reverse engineering
dilemma by integrating the SMR toolset into the F/A-18
SEE. SMR provides the reverse engineering and design
re-engineering capabilities for existing CMS-2/AYK-14
assembly based applications. The solution interfaces the
SMR design representation to the SEE equivalent, Cadre
Teamwork/Ada. Cadre Teamwork/Ada is a design level
software tool represented with an Ada structure graph
notation. The SMR toolset provides a natural enhancement
to the F/A-18 SEE.

As seen in Figure 4, the F/A-18 SEE provides a reference
model defining a foundational platform for presentation,
process, control and data elements. The platform
allows software tools to be configured and integrated
within the SEE. The SMR toolset fills another slot in the
reference model.

Figure 4. SEE Reference Model

5.2. Investigation of COTS Tools [28]

Software engineering tools provide an expanding variety
of capabilities for automating manual software engineering
processes and methods. These software tools have
evolved from prototypes to powerful, fully supported
commercial tools for small and large scale development. In
addition, tool vendors have modified their attitudes from
single point solutions to integrated toolsets. Commercial
tools are increasingly addressing previous gaps in coverage,
especially those in program understanding and
reverse engineering.

Collectively, over a hundred software companies produce
hundreds of various software tools that relate to most
of the SMR domain. A rigorous evaluation of currently
available software tools is therefore required to identify
those most applicable and effective for the SMR project.
In addition, the evaluation survey assesses the current
technology state, uncovers emerging standards, and provides
proper information for proposing the solution.

In order to decrease the amount of evaluation time and
increase evaluation efficiency, the SMR project defined an
evaluation survey process:

1. Collect vendor/tool prospects - sources include
in-house knowledge, trade magazines, marketing
advertisements, STSC list, Ada Letters, and CASE
Product Guide,

2. Contact vendors - this step obtains the technical
information needed for product evaluation,

3. Evaluate product information (first phase) - first
phase evaluation provides SMR relevancy information
in order to down-select,

4. Generate first-pass filter criteria - basic SMR scope
level including programming languages (i.e. CMS-2
or Ada), reverse engineering, design viewing, design
re-engineering, and forward engineering,

5. Perform more in-depth technical evaluation - hands-on
usage, in-depth discussions with vendor’s technical
department, collaborative discussions,

6. Generate second-pass filter criteria - interfaces to
other tools, tool scalability, platform and environment,
collaboration openness, industry standards
conformity, and

7. Propose tool(s) for SMR solution.

After the first two steps, it was apparent that much
research and development in the software migration and
re-engineering field has resulted in many supportive software
tools. In all, 87 software tools were identified
that conceptually relate to the SMR project. Further investigation
showed many of these tools aid in migrating and
re-engineering software written in FORTRAN, COBOL,
C, and Pascal to other languages. A few CMS-2 to Ada
translators exist, but none are capable of handling the
assembly language portions of the system.

After steps three and four, 29 tools did not require further
evaluation due to the vendor’s market focus (i.e.,
MVS/COBOL business application market). Another 40
tools were marked for contingency evaluation, to be
revisited in the event of a broadening in SMR scope or a
need to acquire additional technical approaches and
knowledge. A group of 18
tools or toolsets were nominated for further evaluation.
The following is a listing of those tools nominated for
more in-depth technical evaluation:

Advanced Systems Technology Corp. - Astec’s 2167A
Tool Set is an extension to the CaMERA’s semantic database
repository capabilities for producing DOD-STD-2167A
documentation.

Ascent Logic - Ascent Logic’s RDD-100 tool provides
full requirements analysis and traceability, behavioral
modeling and simulation, and component allocation.

CACI - CACI’s GenEleC generates complete reusable
Ada  packages  from  CASE  component  specification  via 
data  element  specifications. 

Cadre - Cadre’s CASE tools contain the Teamwork
series, providing graphical representation of software systems
for analysis, design, and code generation. Cadre promotes
the CDIF data repository protocol which allows
interfacing to other CDIF compliant software tools.

Carleton University - TimeBench is a CAD tool for the
design of real-time systems and supports an extended version
of Buhr’s “MachineCharts” diagramming notation.

DDC-I - DDC-I provides a CASE tool box for software
development life-cycle support from requirements definition
to maintenance.

Hewlett-Packard - HP’s SoftBench 3.0 framework is a
tool integration platform which combines SoftBench program
construction tools, encapsulated third-party development
tools, and one’s own custom development tools.

Hill  AFB  Software  Technology  Support  Center  -  The 
STSC  has  the  JOVIAL  Re-engineering  Tool  Set  (JRETS), 
which  automatically  analyzes  JOVIAL  source  code  and 
extracts  design  information. 

i-Logix - i-Logix’s CASE tool, Statemate, is a systems
engineering  tool  permitting  the  graphic  modeling  and  the 
design  of  complex,  reactive  systems. 

Interactive Development Environments (IDE) - IDE’s
CASE tools include Software through Pictures (StP), providing
graphical representation of software systems for
analysis and design. Integrated with IDE’s Ada Development
Environment, the user can generate Ada code.

McCabe  &  Associates  -  McCabe  has  the  Battlemap 
Analysis  Tool  (BAT),  a  software  reverse  engineering  and 
maintenance  tool  which  analyzes  system  level  source  code 
and  calculates  McCabe  complexity  metrics. 

Mark V Systems - Mark V Systems’ CASE tools
include ObjectMaker and Adagen. ObjectMaker is a
user-tailorable CASE workbench, while Adagen provides the
Ada OO component with Ada code generation.

MCC - Previous consortial research at MCC generated
design information recovery (DESIRE) technology.
DESIRE contains software for scanning and parsing programming
languages, intermediate object base representation,
graphical entity relationship modeling, design
viewing, and object-oriented design representations.


Naval Surface Warfare Center - The NSWC has On-Line
Tools (OLTools), a collection of report generation
utilities for program designs involving CMS-2 source
files, Target System Files, and AEGIS SYSBLD/7 specification
files, amongst others.

Oregon State University (OSU) - OSU has a parser as a
front-end to a compiler for CMS-2 and AYK-14 assembler
code.

Purdue University - The Purdue Compiler-Construction
Tool Set (PCCTS) provides a set of tools for constructing
compilers and code-to-code translators.

Quantasm Corp. - Quantasm has the ASMFlow Professional,
an assembly language flow charting and source
code analysis tool for the IBM PC.

Reasoning Systems - Reasoning Systems supports and
licenses two customizable software maintenance and
re-engineering tools, Software Refinery and REFINE.
Software Refinery allows for building automated software
processing tools, while REFINE provides source code
analysis and re-engineering.

Performing steps five and six allowed the SMR project
team to concentrate on three tools: Cadre Teamwork/Ada,
MCC DESIRE, and OSU parser. In addition to these tools,
other tool finalists include IDE Software through Pictures
and Reasoning Systems Software Refinery.
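
The counts reported in this section can be tallied as a quick consistency check on the down-selection. The sketch below is illustrative only: the numbers and tool names are taken from the text, while the bucket names are labels chosen here and are not terms used by the survey.

```python
# Counts as reported in the text; the bucket names are illustrative labels.
IDENTIFIED = 87
FIRST_PASS_BUCKETS = {
    "no_further_evaluation": 29,   # vendor market focus outside SMR
    "contingency_evaluation": 40,  # revisit if SMR scope broadens
    "nominated": 18,               # in-depth technical evaluation
}
SELECTED = ["Cadre Teamwork/Ada", "MCC DESIRE", "OSU parser"]
OTHER_FINALISTS = [
    "IDE Software through Pictures",
    "Reasoning Systems Software Refinery",
]


def buckets_sum_to_total(identified, buckets):
    """True when the bucket counts add up to the number of tools identified."""
    return sum(buckets.values()) == identified


if __name__ == "__main__":
    print(buckets_sum_to_total(IDENTIFIED, FIRST_PASS_BUCKETS))
    print(len(SELECTED) + len(OTHER_FINALISTS), "finalists out of",
          FIRST_PASS_BUCKETS["nominated"], "nominated")
```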

5.3.  The  Integrated  Toolset 

Although COTS tools were selected for the SMR
toolset, they do not provide a 100% solution to the process
definition. In addition, the tools are not currently compatible
and need to be integrated together. The SMR project
intends to leverage previous research on software engineering
processes and methodologies with the three COTS
tools to provide an integrated toolset.

The SMR toolset, illustrated in Figure 5, integrates currently
available software and COTS tools with project
developed software. The developed software includes
intermediate representation generation, recovered design
information understanding, design re-engineering (including
object clustering), intermediate representation to
CASE tool transmission, and interactive graphical user
interface software.

Since the OSU parser is a front-end to a compiler, the
software does not have capability for stand-alone execution.
Therefore, the parser will be integrated with Intermediate
Representation Generation Tool software that loads
the recovered information into the Intermediate Representation
Repository (IRR). The IRR represents the recovered
information in an object base format, organized to
reflect the lexical structure of the program source [29].
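
To make the idea of an object base that mirrors the lexical structure of the source more concrete, here is a minimal sketch. It is not the IRR schema, which the text does not describe; every class, field, and name below is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataItem:
    name: str
    size_words: int  # relative size only; the Ada type is chosen later


@dataclass
class Subroutine:
    name: str
    reads: List[str] = field(default_factory=list)
    writes: List[str] = field(default_factory=list)


@dataclass
class Module:
    name: str
    language: str  # for example "CMS-2" or "AYK-14 assembly"
    data_items: List[DataItem] = field(default_factory=list)
    subroutines: List[Subroutine] = field(default_factory=list)


@dataclass
class Repository:
    """Modules nest data items and subroutines, as the source text does."""
    modules: List[Module] = field(default_factory=list)

    def subroutines_touching(self, item_name: str) -> List[Tuple[str, str]]:
        """Return (module, subroutine) names that read or write a data item."""
        return [
            (module.name, sub.name)
            for module in self.modules
            for sub in module.subroutines
            if item_name in sub.reads or item_name in sub.writes
        ]
```

A query such as `subroutines_touching("POSITION")` is the kind of slice an understanding tool could draw from a structure like this.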

For information retrieval, the SMR project creates and
exercises a Recovered Design Information Understanding
Tool, software for understanding the recovered design
information. This tool interprets the IRR and then generates
graphical views and slices for program and design
understanding.

Since the recovered information will not be conducive
to object-oriented or Ada design notations, the SMR
project provides a Design Re-engineering Tool. This tool,
in conjunction with the understanding tool, provides a
semi-automated method for clustering recovered information
into the appropriate design notation. For example, the
user would have the ability to select variables or program
modules and associate them as objects. In addition, this
tool would permit a functional re-design of the recovered
information.

The last tool developed for the SMR project is a Data
Transmission Tool, software that interprets the intermediate
representation repository and exports data to the CASE
tool via a CDIF interface protocol. The CASE tool provides
code generation as a part of the F/A-18 SEE and is
compliant with CDIF.

The SMR toolset is exercised through a highly interactive
graphical user interface (GUI). The inherent object
capabilities of the X Window System and the Motif graphical
widget set enrich the human interaction with the toolset.

6.  Conclusion 

The SMR Pilot Project contributes many benefits to the
F/A-18 community and to the software engineering
domain. By integrating the SMR toolset into the F/A-18
SEE, the software development branch receives a cohesive
solution. This integrated solution creates a mechanism to
capture and take advantage of pre-existing engineering
awareness and domain expertise for the current version of
the avionics software. Automating the capture of this
information provides the means for an increase in efficiency
in migrating future avionics software.

In addition, the SMR project advances research in the
software migration and re-engineering domain. Since
much legacy code is assembly based, these legacy system
owners and maintainers gain an automated means for capturing
program knowledge. Interfacing the design information
recovery with forward engineering offers a method
for migrating legacy software to modern and future
languages.

Abstract

Large  information  systems  often  require  the  fusion  of 
multiple  databases  to  achieve  desired  functionality.  In  this 
paper,  we  focus  on  how  automated  query  formulation 
capabilities  may  be  realized  over  a  set  of  fused  databases. 
Reverse  engineering  issues  related  to  database  design  and 
fusion  are  discussed,  and  a  query  formulation  and  design 
system  known  as  QUICK  is  described.  A  case  study  is 
presented  in  which  the  logical  schemas  for  two  independent 
U.  S.  Army  databases  are  reverse  engineered  into  conceptual 
schemas that are subsequently used for data fusion and
automatic query generation. In addition, enhanced methods
that employ meta-level conceptual constructs to support
reverse  engineering,  data  fusion,  and  query  formulation  are 
described. 


1.  Introduction 

In  the  course  of  reengineering  a  large  information  system, 
it  is  often  necessary  to  fuse  multiple  databases  to  provide 
desired  functionality.  From  a  systems  engineering 
perspective,  database  fusion  should  be  modeled  explicitly 
to  facilitate  information  system  development  and 
maintenance.  As  a  minimum,  database  logical  schemas 
should  be  modified  to  ensure  consistency  among  attributes 
and  to  foster  interaction  among  participating  databases.  In 
many  cases,  however,  modification  of  existing  databases 
is  not  feasible  (e.g.,  operational  or  legal  constraints  preclude 
database  design  modification).  In  such  cases,  participating 
databases  must  be  associated  in  a  nonintrusive  manner. 
To  a  great  extent,  such  association  can  be  accomplished 
by  reverse  engineering  logical  schemas  into  conceptual 
schemas  and  relating  participating  databases  via  appropriate 
meta-level  representation  constructs. 

Even when database fusion is realized, however, it is
often difficult for users to formulate queries over participating
databases. In particular, users must be aware of logical-level
query languages as well as underlying logical structures
and fusion links. For complex systems, queries may be
elaborate, and may take many hours (and iterations) to
write. Query formulation problems are exacerbated when
the participating databases are large, and multiple logical
models are employed. Yet, by using reverse-engineered
conceptual schemas, it is possible to abstract user interaction
from the logical level to the conceptual level. Furthermore,
with appropriate meta-level abstractions, the need for
conceptual-level knowledge can be significantly reduced.

In this paper, we focus on how queries can be
automatically generated from high-level requests posed over
a set of fused databases. In the next section, a system
known as QUICK (for “QUICK is a Universal Interface
with Conceptual Knowledge”) is described. QUICK
supports database reverse engineering and automated query
formulation by modeling global logical schemas at the
conceptual level. In Section 3, reverse engineering issues
are discussed and the implications of data fusion are
considered. In Section 4, relevant portions of two U. S.
Army databases are described, and corresponding portions
of the reverse-engineered conceptual schemas are presented.
Then, in Section 5, specific data fusion issues regarding
the U. S. Army databases are described. Finally, in Section
6, some enhancements are proposed to improve the
performance of QUICK for complex systems reengineering
and development.

2.  Query  Formulation  and  Contexts 

Formulating queries over a complex database is a
cognitive activity that requires extensive knowledge of the
syntax of a database query language and the structure of
the underlying database. While query language details can
be masked by high-level interfaces, requirements for
knowledge of the structural associations are not so easily
hidden. For example, most graphical user interfaces require
that users specify explicit navigational paths and join
criteria [4,22]. However, for complex databases, there
may be many ways to associate data, and only some of
those ways may be semantically meaningful. In addition,
graphical interfaces become difficult to use with large
databases that require many screens for representing high-
level aggregate database objects; in such cases, a user
must navigate among screens as well as database objects.

An alternative to graphical interfaces is the universal
relation (UR) interface approach [10,12,20]. With a UR
interface, a user is given the impression that all database
attributes are represented in a single relation. Thus, querying
requires only the specification of attributes and high-level
constraints, and the interface is responsible for determining
semantically reasonable associations and inferring
appropriate joins.

Several approaches have been proposed for developing
UR interfaces [9,13]. One of the more popular approaches
is based on the notion of maximal objects [11], where
associations among attributes are inferred based on specified
functional dependencies. While based on sound principles,
there are several problems with the approach. First, it
fails to use the constructs with which a database designer
is most familiar. Thus, a separate knowledge base is
required that is orthogonal to that used for design. Second,
the exponential behavior of the approach does not scale
well as system complexity increases; this is particularly
disturbing in a data fusion environment where the
participating databases may be large. Finally, because of
its low level of abstraction, the approach does not lend
itself to reverse engineering.

A  more  promising  approach  to  constructing  UR  interfaces 
is  based  on  using  the  knowledge  captured  in  a  high-level 
semantic  data  model.  In  the  QUICK  system,  an  Extended 
Entity-Relationship  (EER)  model  is  used  [3,17,19]  for 
formulating  queries.  In  particular,  QUICK  segments  the 
EER  conceptual  schema  into  overlapping  subgraphs  of 
strongly  associated  conceptual  schema  objects,  and  then 
selects  appropriate  subgraphs  of  the  EER  conceptual  schema 
that  are  mapped  to  logical-level  constructs. 

By  operating  at  the  EER  level,  QUICK  overcomes 
many  of  the  limitations  that  hamper  the  maximal  object 
approach.  First,  by  using  EER  constructs,  conceptual 
database  design  knowledge  can  be  used  directly.  Second, 
though  certain  aspects  of  behavior  may  still  be  exponential, 
analysis  of  the  conceptual  schema  is  performed  over 
aggregate  objects  and  conceptual-level  constraints  are  applied 
to  reduce  the  time  it  takes  to  identify  strongly  associated 
sets of objects. Finally, the approach can be used effectively
in a reverse engineering environment. In particular,
additional  EER  constructs  have  been  introduced  in  QUICK 
that  facilitate  reverse  engineering  a  conceptual  schema  from 
a  logical  schema. 

QUICK  is  able  to  generate  queries  efficiently  by 
preprocessing  an  EER  conceptual  schema  into  maximal 
acyclic  subgraphs  of  strongly  associated  objects.  These 
maximal  subgraphs  are  referred  to  as  contexts  [15,16]. 
The  theory  underlying  contexts  assumes  that  a  single 
database  is  being  modeled.  Yet,  it  is  often  the  case  that 
information  systems  are  composed  from  multiple  databases. 
Thus,  the  notion  of  contexts  must  be  extended  to  take  into 
account  possible  data  fusion  requirements.  In  many  cases 
(e.g.,  federated  database  systems  [18]),  it  is  not  feasible  to 
develop  a  unified  global  conceptual  schema,  as  the  individual 
databases  continue  to  exist  and  be  queried.  Thus,  context 
regeneration  over  a  global  schema  may  not  be  appropriate. 
In  such  cases,  a  designer  will  have  to  identify  explicit 
gateways  that  can  be  used  to  fuse  the  participating  databases 
and  thus  allow  global  querying.  These  gateways  often  can 
be  identified  when  the  participating  logical  schemas  are 
reverse  engineered  into  richer  conceptual  schemas. 

3.  Reverse  Engineering  from  a  Logical 
Schema 

As  information  systems  incorporate  multiple  databases, 
the  need  for  effective  data  fusion  becomes  essential. 
However, fusing databases is fraught with problems related
to  disparate  logical  models,  data  redundancy,  and  potential 
inconsistencies.  To  facilitate  communication  among  system 
designers,  a  high-level  semantic  data  model  [7]  often  is 
chosen  into  which  logical  models  are  reverse  engineered. 
From  a  database  perspective,  the  EER  model  is  often  the 
target  model.  The  EER  model  is  independent  of  a  logical 
implementation  and  supports  mappings  to  a  variety  of 
logical  models  (e.g.,  relational,  network,  hierarchical,  and 
object-oriented).  Moreover,  when  the  reverse-engineered 
conceptual  design  is  completed,  it  can  be  enriched  with 
additional  knowledge  (e.g.,  constraints  that  can  be  used  for 
automated  query  formulation). 

Many techniques have been developed for reverse
engineering a logical-level schema into an EER conceptual
schema [2,5,8,14]. For example, Batini et al. [1] describe
a process for converting a relational database schema that
entails identifying primary relations (i.e., relations with
primary keys that do not contain keys of other relations),
weak primary relations (i.e., relations with primary keys
that contain primary keys of other relations), and secondary
relations (i.e., relations with primary keys that are the
concatenation of primary keys of multiple relations). Once
the relations are classified, they can be mapped into EER
objects. For example, primary relations are mapped into
entity types, weak primary relations are mapped into weak
entity types, and secondary relations are mapped into
relationship types. Then, the designer can make decisions
regarding inheritance associations among created EER
objects, and can further customize the design to satisfy
additional system constraints that may exist.
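
The classification step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of Batini et al. or of QUICK: it assumes that key attributes have already been renamed so that they match syntactically across relations, and the function name `classify_relations` and the sample schema (`employee`, `project`, `dependent`, `works_on`) are hypothetical.

```python
from typing import Dict, FrozenSet


def classify_relations(primary_keys: Dict[str, FrozenSet[str]]) -> Dict[str, str]:
    """Classify each relation by how its primary key relates to other keys."""
    kinds = {}
    for name, pk in primary_keys.items():
        # Primary keys of *other* relations that are strictly contained in this key.
        contained = [
            other_pk
            for other, other_pk in primary_keys.items()
            if other != name and other_pk and other_pk < pk
        ]
        if not contained:
            kinds[name] = "primary"        # candidate entity type
        elif frozenset().union(*contained) == pk:
            kinds[name] = "secondary"      # candidate relationship type
        else:
            kinds[name] = "weak primary"   # candidate weak entity type
    return kinds


# Hypothetical schema: relation name -> set of primary key attributes.
schema = {
    "employee": frozenset({"ssn"}),
    "project": frozenset({"pnum"}),
    "dependent": frozenset({"ssn", "dep_name"}),
    "works_on": frozenset({"ssn", "pnum"}),
}
kinds = classify_relations(schema)
```

In this sketch, a key that is exactly covered by the keys of other relations marks a secondary relation, while a key that embeds another key plus attributes of its own marks a weak primary relation. Decisions about inheritance are still left to the designer.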

Because  of  the  sparse  representation  at  the  logical  level, 
the  process  of  reverse  engineering  from  a  logical  schema 
into  a  conceptual  schema  cannot  be  fully  automated.  For 
example,  the  classification  of  relations  may  require  renaming 
of  candidate  and  foreign  keys  so  that  syntactic  matching 
may  be  performed.  Similarly,  without  explicit  inclusion 
dependency  constraints,  it  is  not  possible  to  identify 
inheritance  lattices.  Structural  constraints  on  relationship 
types  (e.g.,  cardinality  ratio  and  participation  constraints) 
can  sometimes  be  inferred,  but  must  often  be  provided  by 
the  designer.  These  problems  are  exacerbated  if  the  logical- 
level  design  is  not  in  an  appropriate  form  (e.g.,  relations 
should  be  in  at  least  third  normal  form),  as  it  is  difficult 
to infer nontrivial dependencies within an object.

When  multiple  databases  must  be  fused,  the  reverse 
engineering  process  is  further  complicated.  Issues  regarding 
data redundancy and consistency across databases must be
resolved. Furthermore, decisions must be made regarding
changes  to  existing  databases.  In  many  cases,  changes 
may  be  restricted  due  to  operational  constraints.  In  such 
cases,  the  reverse-engineered  model  must  preserve  the 
mapping  to  the  logical  level  while  representing  at  the 
conceptual  level  such  logical-level  limitations.  From  an 
automated  query  formulation  perspective,  this  means  that 
attribute  disambiguation  must  occur  across  databases  as 
well  as  within  a  single  database. 

QUICK  provides  substantial  support  for  reverse 
engineering.  For  example,  an  optimization  relationship 
type  has  been  introduced  that  enables  logical-level 
denormalization  decisions  to  be  represented  at  the  conceptual 
level.  In  particular,  a  long  path  of  EER  objects  may  be 
circumvented  by  a  short  path  that  can  produce  the  same 
result. Given a high-level request, the shorter path is used
in lieu of the long path if all of the requested attributes are
covered. On the other hand, if intermediate objects are
needed, the long path is used. Note that if an optimization
relationship type were modeled as a basic relationship
type, then, syntactically, the conceptual schema would
imply that distinct paths connected two or more entities.
Consequently, given a request involving attributes of the
connected entities, a query formulator might consider a
union of subqueries to be appropriate or it might determine
that interaction with the user is needed to refine the request
before  the  final  query  can  be  generated.  However,  given 
the  semantics  of  an  optimization  relationship  type,  the 
query  formulator  can  infer  that  such  actions  would  be 
superfluous. 
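
The path selection rule for an optimization relationship type can be illustrated with a minimal Python sketch. The function `choose_path`, the object names, and the attribute sets are all hypothetical and are not taken from QUICK; the sketch only assumes that each EER object is associated with the set of attributes it covers.

```python
def choose_path(requested, short_path, long_path, attrs):
    """Return the short path when it covers every requested attribute."""
    # Attributes covered by the objects on the short (optimization) path.
    covered = set().union(*(attrs[obj] for obj in short_path))
    return short_path if requested <= covered else long_path


# Hypothetical EER objects: A and C are linked either through B (long path)
# or directly through the optimization relationship type A-C-opt (short path).
attrs = {
    "A": {"a_name"},
    "A-B": set(),
    "B": {"b_name"},
    "B-C": set(),
    "C": {"c_name"},
    "A-C-opt": set(),
}
long_path = ["A", "A-B", "B", "B-C", "C"]
short_path = ["A", "A-C-opt", "C"]

direct = choose_path({"a_name", "c_name"}, short_path, long_path, attrs)
via_b = choose_path({"a_name", "b_name"}, short_path, long_path, attrs)
```

Because the choice is deterministic, no union of subqueries and no user interaction is needed in this sketch, which mirrors the inference described above.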

QUICK  also  supports  meta-level  constructs.  For 
example,  a  meta-generalization  construct  is  provided  that 
allows  an  entity  type  instance  to  be  a  child  of  multiple 
instances  of  a  parent  entity  type.  For  example,  a  parent 
instance  concerning  some  type  of  target  may  contain 
information uniquely identified by the key attributes
TARGET IDENTIFICATION, DATE, and TIME. In turn,
some of those parent instances may be further described by
a child instance that depends only on
TARGET IDENTIFICATION. Note that weak entity types
could not be used in this case because the actual child
would  have  to  be  considered  a  parent  in  the  weak  entity 
type  association. 

4.  Description  of  U.  S.  Army  Databases 

The  U.  S.  Army  Research  Laboratory  (ARL)  is  directing 
a  program  designed  to  integrate  heterogeneous  databases  so 
that  high-level  requests  may  be  posed  by  users  unfamiliar 
with  the  underlying  conceptual  or  logical  designs  of  the 
participating  systems.  In  the  final  system,  a  wide  variety 
of  logical  models  will  be  employed  (e.g.,  relational,  inverted 
list, semantic network, and object-oriented). However, for
illustrative  purposes,  only  portions  of  a  relational  database 
and  an  inverted  list  database  are  described  below.  To 
further  simplify  discussion,  only  basic  EER  model 
constructs  are  used. 

Figure 1 shows a portion (i.e., approximately 10 percent)
of the EER conceptual schema for the ARL Electronic
Intelligence (ARL-ELINT) relational database. To avoid
cluttering the diagram, attributes are not shown, though it
should be noted that both entity types and relationship
types may have attributes. ARL-ELINT is used for gathering
and correlating low-level electronic and imaging intelligence
information (e.g., tracking a mobile radar unit via
characteristics such as pulse width and operating frequency).
High-level tactical information is not represented. As the
ARL-ELINT logical schema was well-designed, it was
straightforward to reverse engineer it into an EER conceptual
schema. The process described in Section 3 was used, and
only minimal changes to the logical schema were necessary
to produce the schema shown in Figure 1.

Automating  query  formulation  for  ARL-ELINT  is 
straightforward.  First,  contexts  are  produced,  and  then 
high-level requests are posed. In this particular case, the
EER conceptual schema is acyclic, and there exists only a
single  context.  Thus,  formulating  a  query  entails  pruning 
the  leaves  of  the  context  (i.e.,  the  conceptual  schema 
shown  in  Figure  1)  until  all  leaves  cover  requested  attributes. 
Then,  the  resulting  subgraph  is  mapped  to  a  set  of  relational 
schemas,  natural  joins  are  inferred,  and  the  final  query  is 
produced. 
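
The leaf-pruning step can be sketched as follows. This is a simplified illustration rather than QUICK's implementation: the function `prune_context` is hypothetical, and because Figure 1 is not reproduced here, the edges between objects and all attribute names other than prf, rf, pw, and elnot_id are assumptions made only for the example.

```python
def prune_context(edges, attrs, requested):
    """Remove leaves that cover no requested attribute until none remain."""
    nodes = set(attrs)
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes):
            # A leaf has at most one remaining neighbor in the acyclic context.
            if len(adj[n]) <= 1 and not (attrs[n] & requested):
                for m in adj[n]:
                    adj[m].discard(n)
                del adj[n]
                nodes.discard(n)
                changed = True
    return nodes


# Assumed topology and attributes for a small acyclic context.
attrs = {
    "ELINT-Report": {"elnot_id"},
    "ELINT-RF": {"rf"},
    "ELINT-PRF": {"prf"},
    "ELINT-PW": {"pw"},
    "ELINT-Scan-Rate": {"scan_rate"},
    "EOB-Component": {"component_name"},
    "EOB-Site": {"eob_site_name"},
    "Parent-Site": {"parent_site_name"},
}
edges = [
    ("ELINT-Report", "ELINT-RF"),
    ("ELINT-Report", "ELINT-PRF"),
    ("ELINT-Report", "ELINT-PW"),
    ("ELINT-Report", "ELINT-Scan-Rate"),
    ("ELINT-Report", "EOB-Component"),
    ("EOB-Component", "EOB-Site"),
    ("EOB-Site", "Parent-Site"),
]
pruned = prune_context(edges, attrs, {"prf", "rf", "pw", "elnot_id"})
```

Pruning is repeated because removing one leaf can expose a new one; in the assumed topology, removing Parent-Site turns EOB-Site into a leaf, and so on inward.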


Figure  1.  Portion  of  ARL-ELINT  database. 

As  an  example,  suppose  that  a  request  were  made  to 
list  the  known  operating  characteristics  (i.e.,  the  pulse 
repetition frequency [PRF], the radio frequency [RF], and
the pulse width [PW]) of electronic device ABC1234. Using
QUICK  directly  (i.e.,  without  a  higher  level  interface), 
this  request  could  be  expressed  as  follows: 

Select prf,
       rf,
       pw
Where elnot_id = "ABC1234"

The  notation  above  is  referred  to  as  USQL  because  it 
resembles  SQL,  but  assumes  a  universal  relation  view  of 
the database (i.e., it is not necessary to qualify attributes
with relation information, specify a from clause, or
include  natural  join  criteria). 

Upon  receiving  the  request,  QUICK  identifies  the 
contexts  in  which  the  requested  attributes  reside.  As  Figure 
1  contains  a  single  context  (i.e.,  the  complete  conceptual 
schema),  QUICK  prunes  the  context  of  extraneous  leaves. 
In particular, the following EER objects are pruned: EOB-Site,
Parent-Site, Detected-Scan-Rate, and ELINT-Scan-Rate.
Note that while attributes are not shown in
Figure 1, each leaf in the pruned context covers at least
one requested attribute. After mapping the EER objects in
the pruned context to logical schema objects and determining
a natural join order based on explicit EER associations,
the following sequence of relation schemas is produced:

(elint_report, elint_rf, elint_prf, elint_pw)

At  this  point,  attributes  are  qualified  with  relation  and 
database  information,  natural  join  criteria  are  inferred,  and 
the  final  query  is  generated: 

SELECT
    arl_elint..elint_prf.prf,
    arl_elint..elint_rf.rf,
    arl_elint..elint_pw.pw
FROM
    arl_elint..elint_report,
    arl_elint..elint_rf,
    arl_elint..elint_prf,
    arl_elint..elint_pw
WHERE
    arl_elint..elint_report.elnot_id = "ABC1234" AND
    arl_elint..elint_rf.elint_report_db_id =
        arl_elint..elint_report.elint_report_db_id AND
    arl_elint..elint_prf.elint_report_db_id =
        arl_elint..elint_report.elint_report_db_id AND
    arl_elint..elint_pw.elint_report_db_id =
        arl_elint..elint_report.elint_report_db_id

The query above is more complicated than its
corresponding USQL request. In fact, in complex databases,
it is not unusual to generate SQL queries that are more
than 100 lines in length from USQL requests that are only
a few lines in length. It should also be noted that USQL
was not designed for direct use in an interface. Instead, it
was designed to facilitate the construction of higher level
interfaces, such as those based on direct manipulation or
natural language. Such interfaces accept requests expressed
in a user-friendly form, convert them to USQL, and pass
the converted requests to a system such as QUICK for
query generation. Significantly, as a result of the capabilities
provided by automated query formulation systems, interface
designers can focus on user interaction instead of on structural
aspects of the underlying databases.
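
The final generation step for the ARL-ELINT example (qualifying attributes with database and relation names and adding natural join criteria) can be sketched as below. The helper names `qualify` and `build_query` are hypothetical, and the sketch assumes that the join columns have already been inferred from the EER associations; it is not QUICK's actual generator.

```python
def qualify(db, rel, col):
    """Qualify a column as db..rel.col, the style used in the generated query."""
    return f"{db}..{rel}.{col}"


def build_query(db, select, relations, joins, filters):
    """Assemble a SQL string from already-inferred pieces."""
    select_list = ",\n    ".join(qualify(db, r, c) for r, c in select)
    from_list = ",\n    ".join(f"{db}..{r}" for r in relations)
    # Selection criteria first, then one equality per inferred natural join.
    conditions = [f"{qualify(db, r, c)} = {value}" for (r, c), value in filters]
    conditions += [
        f"{qualify(db, *left)} = {qualify(db, *right)}" for left, right in joins
    ]
    where = " AND\n    ".join(conditions)
    return f"SELECT\n    {select_list}\nFROM\n    {from_list}\nWHERE\n    {where}"


relations = ["elint_report", "elint_rf", "elint_prf", "elint_pw"]
key = "elint_report_db_id"
sql = build_query(
    db="arl_elint",
    select=[("elint_prf", "prf"), ("elint_rf", "rf"), ("elint_pw", "pw")],
    relations=relations,
    # Each dependent relation joins to elint_report on the shared key.
    joins=[((rel, key), ("elint_report", key)) for rel in relations[1:]],
    filters=[(("elint_report", "elnot_id"), '"ABC1234"')],
)
print(sql)
```

The three-line USQL request expands into one selection criterion plus one join criterion per dependent relation, which is why generated queries grow quickly as more relations participate.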

Figure  2  shows  a  portion  of  the  Military  Intelligence 
Integrated  Data  System  and  Integrated  Database 
(MIIDS/IDB), which is an inverted file database that
integrates  data  contained  in  the  Automated  Intelligence 
File  and  the  Defense  Intelligence  Order  of  Battle  System. 
While  all  three  systems  contain  data  to  support  command, 
control,  and  communications,  MIIDS/IDB  also  is  used  to 
support  war  planning  and  fighting  at  multiple  levels  (i.e., 
national,  theater,  tactical,  and  operational  levels). 

The heavy line in Figure 2 corresponds to one of the
five (overlapping) contexts that comprise the MIIDS/IDB
EER conceptual schema. Two rules were used to include
EER objects in the context shown. First, any entity type
that can be functionally determined (via N:1 and 1:1
relationship types) from Equipment-Type and that does
not introduce a cycle at the EER level is included with its
associating relationship type. This follows from a theorem
in relational database theory which says that if one relation
functionally determines the attributes in another, then the
two relations can be natural joined in a lossless manner
[21]. Consequently, a strong association is implied and
context inclusion is justified. Second, the recursive
relationship types are included because their participating
entity types are included. This follows from the fact that
any recursive relationship type relation can be joined in a
lossless manner with its participating entity type relation,
and thus is strongly associated with that entity type.
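
The two inclusion rules can be approximated with a breadth-first traversal. The sketch below is a simplification and not the context-generation algorithm of [15,16]: the function `build_context` is hypothetical, the entity and relationship names (E1, R12, and so on) are placeholders because Figure 2 is not reproduced here, and a functional edge is simply skipped whenever its target is already in the context, which is one way of avoiding a cycle.

```python
from collections import deque


def build_context(root, functional_edges, recursive):
    """Collect entity types functionally determined from root, without cycles.

    functional_edges: (source, relationship, target) triples for N:1 or 1:1
    relationship types, read as "source functionally determines target".
    recursive: maps each recursive relationship type to its entity type.
    """
    entities = {root}
    relationships = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for source, rel, target in functional_edges:
            # Rule 1: follow functional edges; skip any that would close a cycle.
            if source == current and target not in entities:
                entities.add(target)
                relationships.add(rel)
                queue.append(target)
    # Rule 2: a recursive relationship type joins whenever its entity type does.
    relationships |= {rel for rel, entity in recursive.items() if entity in entities}
    return entities, relationships


# Placeholder schema: E1 determines E2 and E3; E2 also determines E3 and E4.
edges = [
    ("E1", "R12", "E2"),
    ("E1", "R13", "E3"),
    ("E2", "R23", "E3"),
    ("E2", "R24", "E4"),
]
entities, relationships = build_context("E1", edges, {"R22": "E2", "R55": "E5"})
```

Here R23 is left out because E3 is already reachable through R13, so including it would introduce a cycle at the EER level. Which of two competing edges is dropped depends on traversal order in this sketch.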

Due  to  the  violation  of  various  relational  design 
guidelines,  reverse  engineering  the  conceptual  schema  of 
MIIDS/IDB  using  the  techniques  described  in  Section  3 
was  not  straightforward.  For  example,  attributes  were 
inappropriately  propagated  from  entities  to  relationships, 
thus  complicating  database  updates  and  violating  normal 
form  guidelines  based  on  functional  dependency 
specifications  [6,21].  In  addition,  some  relationships 
actually  represented  sets  of  relationships  with  variant 
primary  keys.  Thus,  entity  integrity  [6]  was  violated  as 
there  was  no  standard  primary  key  that  did  not  contain  null 
values  for  some  tuple. 

Because  of  the  problems  associated  with  the  logical 
schema,  a  decision  was  made  by  ARL  designers  to  abandon 
the  logical  schema  and  to  create  instead  a  more  idealized 
conceptual  schema.  However,  to  the  extent  possible,  an 
attempt was made to preserve the mappings from the EER
level to the logical level. The revised conceptual schema
was  then  provided  as  input  to  QUICK,  which  performed 
context  analysis  and  generated  queries  in  response  to  high- 
level  requests. 


Figure  2.  Portion  of  MIIDS/IDB  database  with  one 
context  shown. 


Only after several queries were generated was the
magnitude of the problems associated with the MIIDS/IDB
design revealed. In particular, the formulated queries
did not correspond to the likely intent of a typical user.
However, when the queries were analyzed individually, the
rationale for QUICK's behavior was apparent. Consequently, a decision
had  to  be  made  as  to  whether  the  general  set  of  contexts 
should  be  discarded  in  favor  of  a  manually  created  set  or 
whether  further  redesign  was  justified.  Upon  careful  analysis, 
it  became  apparent  that  the  EER  conceptual  schema  was 
flawed  and  should  be  revised.  After  several  iterations  of 
revision,  the  generated  queries  were  deemed  reasonable, 
implying  that  the  new  contexts  effectively  represented  sets 
of  strongly  associated  EER  objects. 

The  iterative  process  involved  in  the  redesign  of 
MIIDS/IDB  demonstrated  that  QUICK  can  play  a  valuable 
role  in  validating  either  a  new  design  or  a  reverse-engineered 
design.  In  particular,  an  EER  conceptual  schema  can  be 
given to QUICK for analysis. Then, typical high-level
requests can be posed, and the generated queries can be
analyzed.  If  the  queries  are  deemed  inappropriate,  then  the 
conceptual  schema  is  evaluated  by  designers  and  modified 
accordingly.  The  process  iterates  until  no  more  changes 
are  necessary  or  until  the  designers  feel  that  no  more 
changes  are  justified.  In  the  latter  case,  the  generated 
contexts  can  be  manually  modified  so  that  the  resultant 
queries  are  consistent  with  the  likely  intent  of  typical 
users.  In  practice,  manual  context  modification  has  not 
been  necessary  for  well-designed  EER  conceptual  schemas. 

5.  Fusing  the  U.  S.  Army  Databases 

Though  logically  disjoint,  the  ARL-ELINT  and 
MIIDS/IDB  databases  are  conceptually  associated.  As  a 
result,  decision  makers  would  like  to  pose  requests  that 
integrate  data  from  the  two  systems.  Achieving  the 
appropriate level of data fusion between the databases was
straightforward. Specifically, a gateway relationship type
was introduced that explicitly linked Site in MIIDS/IDB
with EOB-Site in ARL-ELINT. Figure 3 illustrates this
relationship, and identifies the explicit natural join
specification between the corresponding entity types. Note
that the ELINT-MIIDS-Site relationship type is virtual
in  the  sense  that  its  only  purpose  is  to  serve  as  a  gateway 
for  database  fusion.  It  does  not  map  to  a  logical-level 
structure. 
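
A gateway relationship type can be pictured as a small record that carries only a join specification. The following Python sketch is illustrative: the classes `Column` and `Gateway` and the function `fuse` are hypothetical names, while the joined columns (eob_site_name in arl_elint and site_name in miids) are the ones that appear in the generated query later in this section.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Column:
    db: str
    relation: str
    name: str

    def qualified(self) -> str:
        return f"{self.db}..{self.relation}.{self.name}"


@dataclass(frozen=True)
class Gateway:
    """Virtual relationship type: a join specification with no table of its own."""
    name: str
    left: Column
    right: Column

    def join_predicate(self) -> str:
        return f"{self.left.qualified()} = {self.right.qualified()}"


def fuse(left_tables: List[str], right_tables: List[str],
         gateway: Gateway) -> Tuple[List[str], str]:
    """Combine two FROM lists; the gateway adds a predicate but no table."""
    return list(left_tables) + list(right_tables), gateway.join_predicate()


gateway = Gateway(
    name="ELINT-MIIDS-Site",
    left=Column("arl_elint", "eob_site", "eob_site_name"),
    right=Column("miids", "site", "site_name"),
)
tables, predicate = fuse(["arl_elint..eob_site"], ["miids..site"], gateway)
```

Keeping the gateway free of any table of its own reflects the point made above: it exists only at the conceptual level and does not map to a logical-level structure.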


Figure  3.  Gateway  relationship  type  between  ARL- 
ELINT  and  MIIDS/IDB  systems. 

With  the  gateway  relationship  type  in  place,  contexts 
were  regenerated,  and  queries  were  formulated  in  response 
to  high-level  requests.  In  particular,  it  became  possible  to 
generate queries that required data fusion between low-level
electronic  intelligence  information  and  high-level  tactical 
information.  For  example,  suppose  that  a  request  were 
made  concerning  the  last-known  activities,  mobility  status, 
and  equipment  types  and  quantities  of  any  known  enemy 
unit  associated  with  an  EOB  site  where  a  pulse  repetition 
frequency  (PRF)  of  50  Hz  has  been  detected.  In  USQL, 
the  request  could  be  expressed  as  follows: 

Select site_name,
       unit_name,
       unit_role,
       echelon,
       activity_name,
       mobility_status,
       equip_code,
       equip_quantity
Where allegiance = "ENEMY" And
      prf = 50

In  turn,  the  generated  query  is  as  follows: 

SELECT
    miids..site.site_name,
    miids..unit.unit_name,
    miids..unit.unit_role,
    miids..unit.echelon,
    miids..unit.activity_name,
    miids..unit.mobility_status,
    miids..equipment_type.equip_code,
    miids..equipment.equip_quantity
FROM
    miids..equipment,
    miids..unit,
    miids..site,
    arl_elint..eob_site,
    arl_elint..eob_component,
    arl_elint..elint_report,
    arl_elint..elint_prf,
    miids..equipment_type
WHERE
    (miids..unit.allegiance = "ENEMY" AND
     arl_elint..elint_prf.prf = 50) AND
    miids..equipment.unit_name = miids..unit.unit_name AND
    miids..equipment.unit_role = miids..unit.unit_role AND
    miids..equipment.echelon = miids..unit.echelon AND
    miids..equipment.site_name = miids..site.site_name AND
    arl_elint..eob_site.eob_site_name = miids..site.site_name AND
    arl_elint..eob_component.eob_site_db_id =
        arl_elint..eob_site.eob_site_db_id AND
    arl_elint..elint_report.elnot_id =
        arl_elint..eob_component.elnot_id AND
    arl_elint..elint_prf.elint_report_db_id =
        arl_elint..elint_report.elint_report_db_id AND
    miids..equipment.equip_code = miids..equipment_type.equip_code

The resultant query is quite a bit more complicated than
its corresponding USQL request, and demonstrates the
seamless fusion capabilities supported by QUICK.
Conceptually, the query was formulated by finding the
appropriate subgraphs in each of the corresponding databases
(i.e., the pruned conceptual schema of Figure 1 and the
pruned context shown in Figure 2), explicitly connecting
the found subgraphs via the gateway relationship type,
mapping to the logical level, inferring join criteria, and
generating the final query. As desired, the user did not
have to be concerned with the underlying structures of the
individual systems, and did not have to be aware that
multiple systems were required to respond to the request.

6.  Enhancing  the  Approach 

When  the  information  contained  in  distinct  databases  is 
disjoint  or  almost  disjoint  (e.g.,  as  was  the  case  with  the 
ARL-ELINT  database  and  the  MIIDS/IDB  database),  the 
process  of  query  generation  via  explicit  data  fusion 
specification  is  relatively  straightforward.  However,  in 
cases where a large number of attributes are duplicated or
where the participating databases may be fused via multiple
gateways, alternative approaches must be considered.

When  a  large  number  of  duplicate  attributes  exist,  actions 
must  be  taken  to  resolve  potentially  conflicting  information 
and  to  ensure  that  consistent  results  will  be  produced.  As 
with the UR approach, QUICK does not allow duplicate
attributes,  and  warns  designers  if  such  attributes  are  found 
when  multiple  database  schemas  are  analyzed.  In  turn, 
designers  are  given  the  option  of  renaming  attributes  at 
the  conceptual  level  while  retaining  duplicate  logical-level 
names.  Consequently,  high-level  requests  that  specify  the 
appropriate  conceptual-level  attribute  will  result  in  a 
semantically  reasonable  logical-level  query. 
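
The duplicate check and conceptual-level renaming can be sketched as follows. This is an assumption-laden illustration, not QUICK's behavior: it treats an attribute as a duplicate when the same name occurs in more than one database, the functions `find_duplicate_attributes` and `rename_for_concept` are hypothetical, and the `latitude` attribute is invented for the example.

```python
from collections import defaultdict


def find_duplicate_attributes(schemas):
    """Return attributes whose name occurs in more than one database.

    schemas: {database: {relation: [attribute, ...]}}
    """
    locations = defaultdict(list)
    for db, relations in schemas.items():
        for rel, attributes in relations.items():
            for attr in attributes:
                locations[attr].append((db, rel))
    return {
        attr: locs
        for attr, locs in locations.items()
        if len({db for db, _ in locs}) > 1
    }


def rename_for_concept(duplicates):
    """Map distinct conceptual names to the unchanged logical-level names.

    Simplification: assumes each duplicate occurs once per database.
    """
    return {
        f"{db}_{attr}": (db, rel, attr)
        for attr, locs in duplicates.items()
        for db, rel in locs
    }


# Invented attribute lists; only the database and relation names echo the text.
schemas = {
    "arl_elint": {"eob_site": ["eob_site_name", "latitude"]},
    "miids": {"site": ["site_name", "latitude"]},
}
duplicates = find_duplicate_attributes(schemas)
conceptual = rename_for_concept(duplicates)
```

A request that names `miids_latitude` in this sketch resolves to the logical attribute `latitude` in `miids..site`, so the logical schemas themselves are left untouched.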

Requiring  knowledge  of  different  attribute  instances  calls 
for  some  sophistication  on  the  part  of  the  user.  This 
stems  from  the  fact  that  it  is  difficult  to  infer  an  appropriate 
instance  based  on,  for  example,  measures  of  nearness.  A 
promising  compromise  entails  listing  the  ambiguous 
attributes  with  brief  descriptions  so  that  the  user  can  make 
appropriate  selections.  However,  the  approach  requires  a 
sophisticated  interface  that  is  able  to  interact  intelligently 
with  users  about  the  meanings  of  different  attributes.  A 
somewhat simpler approach has been successfully used in
the StarView intelligent interface developed by the Space
Telescope Science Institute [17]. StarView lists all attributes
associated  with  relations,  and  allows  users  to  select  the 
desired  set  of  attributes.  Based  on  the  attributes  chosen, 
specific  attribute  instances  are  identified.  Given  this  set  of 
instances,  QUICK  is  able  to  select  contexts  unambiguously 
and  generate  the  final  query. 


When  a  large  number  of  gateways  exist,  cycles  may  be 
introduced  between  participating  databases.  As  a  result, 
context  regeneration  may  be  inappropriate,  as  context 
inclusion  rules  involving  cycles  can  change  the  sets  of 
strongly  associated  objects  in  an  individual  database.  Thus, 
in  the  general  case,  a  different  approach  is  needed  to  associate 
contexts  among  databases.  One  possibility  being  considered 
involves  introducing  meta-level  constructs  that  enable 
explicit  specification  of  gateway  relationship  types,  but 
that do not affect individual database contexts. A significant
advantage to this approach is the savings in processing
time required for context generation. The approach is also
intuitively  appealing  as  it  retains  the  strong  associations 
found  within  individual  databases.  However,  issues 
regarding  query  disambiguation  arising  from  the  multiple 
paths  among  the  fused  databases  must  still  be  resolved. 

7.  Summary  and  Conclusions 

As information systems grow in complexity, it is often
necessary to integrate multiple databases, and effective
data fusion becomes essential. In this paper, we have
focused  on  how  retrieving  data  across  databases  is 
complicated  by  potential  inconsistencies  among  similar 
data  elements,  as  well  as  by  the  burden  placed  on  users  to 
be  familiar  with  the  underlying  logical  schemas  of 
participating  databases.  We  also  presented  a  system  known 
as  QUICK,  which  has  been  designed  to  counteract  some  of 
the  problems  discussed. 

QUICK  facilitates  reverse  engineering  of  logical  schemas 
into  knowledge-rich  EER  conceptual  schemas  and  supports 
automated  query  formulation.  By  using  an  EER  model, 
QUICK  is  not  dependent  on  a  particular  logical  model,  and 
can  be  used  for  unified  conceptual  design.  Furthermore, 
QUICK's ability to formulate queries efficiently provides
a mechanism for validating both new and reverse-engineered
conceptual schemas. QUICK's capabilities were
demonstrated with portions of two independent U. S. Army
databases, and it was shown how logically disjoint, but
conceptually related database systems could be fused and
queried. In addition, enhanced methods that employ meta-level
conceptual constructs to support reverse engineering,
data fusion, and query formulation were discussed.

Currently,  QUICK  is  being  extended  to  handle  additional 
meta-level  constructs.  For  example,  consideration  is  being 
given  to  supporting  an  abstract  generalization  type  that 
realizes  its  existence  through  the  existence  of  its  children. 
Such  a  capability  would  allow  a  designer  to  specify  a  set 
of  related  entity  types  that  share  the  same  relational  structure, 
but  that  correspond  to  different  conceptual  entity  types. 
Semantic query optimization techniques are also being
developed to ensure that the queries generated by QUICK
will execute efficiently. Finally, extended gateway
relationship types are being explored.

Obsolescence of electronic components in military
systems is becoming one of the most expensive problems
of the Department of Defense (DoD), costing millions of
dollars a year in component reengineering, special orders,
volume buys when only single products are required, and
re-design costs. For this reason, the need for new
techniques to reduce the maintenance cost of older systems
is especially important as funding is reduced and weapon
system life cycles are increased. The Standard Hardware
Acquisition and Reliability Program (SHARP) and the
Flexible Computer Integrated Manufacturing (FCIM)
program office put together the Technology Independent
Representation of Electronic Products (TIREP) program to
address these problems. The main purpose of the TIREP
program is to develop and demonstrate a method to
recreate an equivalent physical manifestation of an
electronic circuit assembly from a digital functional
behavioral description such as the Very High Speed
Integrated Circuit (VHSIC) Hardware Description
Language (VHDL).

In this paper we will describe the efforts of the TIREP
program in recreating form-fit, electronic equivalents of
several Standard Electronic Modules (SEMs). Through
this effort we intend to develop a methodology for
redesigning obsolete components and subsystems quickly
and cost effectively, using VHDL and related standards.

supported by Dr. inghwn Mack. Office of
The work of ewooei from NSWC and NAWC was funded by Dave
Rsher, SHARP Program office.

Objectives

The DoD has set as its goals the movement from
physical inventory to design-to-shelf products. The
concept of design-to-shelf is that new technology
innovation can be developed and prototyped and then held
in a low or pre-production state until the need is
established. The availability of programmable logic
components, such as Programmable Array Logics (PALs),
Programmable Logic Arrays (PLAs) and Field
Programmable Gate Arrays (FPGAs), provides an attractive
vehicle to implement this methodology. If a standard
module based on this technology were available,
inventories would only require a small number of different
part types, which would be personalized for specific
functions as needed. Also, commercially available
standard components which provide equivalent
functionality could be procured at a fraction of the cost of
special component buys. The saving would be both
economical and compress the time to customer for
delivery of electronics. This concept would have obvious
life cycle cost advantages.

Modeling the subsystem using VHDL, we can develop
a technology independent, functional representation of it.
From this model, we can then synthesize a functionally
equivalent circuit that is "plug compatible" with the rest
of the system, using whatever technology is currently
available. If a programmable logic device (PLD) is to be
used, the VHDL model can even generate the bit-stream to
program the PLD.


A  major  objective  of  the  TIREP  Program  is  the 
documentation  of  the  business  rules  and  processes  required 
to  share  the  information  between  multiple  organizations. 
While  the  VHDL  standard  defines  the  representation  of  the 
behavioral  and  structural  information,  actual 
implementation  variation  or  styles  can  impede  successful 
use of this standard information. The development of a
User Guide, which specifies the business rules for using and
structuring the various information, will be a deliverable
of the TIREP Program. This User Guide will assist
organizations in developing VHDL compliant datasets that
will be supported through the life cycle of the product.

Another important requirement of the TIREP
Program is the development of a VHDL-to-manufacturing
interface. The design or storage of VHDL models without
the ability to efficiently transfer the information to the
manufacturing facility is non-productive. It is essential
that the functional definition and the physical constraints
can accurately be transferred to the production facility. As
part of the development of this interface, the TIREP
Program includes the actual production of "format A"
Standard Electronic Modules (SEM "A"s). The Rapid
Acquisition of Manufactured Parts (RAMP) Printed
Wiring Assembly (PWA) facility at NAWC - AD will be
the primary production site.

There  are  other  objectives  defined  for  the  TIREP 
Program  that  will  assist  in  the  development  of  a 
methodology  for  the  representation  and  storage  of 
technology  independent  electronic  products.  These 
objectives  include: 

*  Documenting  the  modeling  process 

* Recommendations to the Repository of Electronic
Components Designs

* Virtual prototyping of Electronic Designs

* Improved representation of timing values and test
vectors

*  Dual-use  technology  transfer 

*  Integration  and  cooperation  with  other  programs  such  as 
GEM. 

Approach 

The TIREP Program selected thirteen SEM "A"
modules of medium to high obsolescence. These modules
were chosen because of the urgency of the need for
replacement parts and also for their simplicity. They are
purely digital and range in complexity from 2-input
NAND gates to basic functional units such as
multiplexers and ALUs. After a methodology for redesign
is developed and proved by the manufacture and test of
these smaller parts, the same procedures can be applied to
other larger format SEMs and commercial format
modules.

The TIREP Program emphasizes usability through
the use of industry standards such as VHDL (IEEE Std.
1076), Waveform and Vector Exchange Specification
(WAVES, IEEE Std. 1029.1), EIA 567 Commercial
Component Specification, the VHDL Data Item
Description (DID, DI-EGDS-80811), and the multi-value
logic system interface package (IEEE Std. 1164). If a
standard set of rules by which to model the components is
not established, the specification is open to a variety of
interpretations and implementation techniques. By
defining a standardized modeling approach and testing the
implementation by exchange of the models, these models
can be used again in the future to synthesize new
components and subsystems.

In order to maintain a technology independent
representation of the SEM, the models were developed
along the guidelines of EIA 567. This implementation
requires a functional "core" model with electrical,
physical, and timing characteristics defined in VHDL
packages. These packages are wrapped around the core
model and values are passed down from the top "board
level" as generics. Therefore, the "core" model remains
purely functional and can be used in the synthesis of
subsequent components. The EIA 567 standard packages
contain definitions of object types to facilitate assigning
DC parametrics and timing information. The logic levels
are defined in IEEE Std. 1164, the nine-level logic
package, and the testbench and test vectors are written in
IEEE Std. 1029.1, the WAVES dataset packages.
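
The separation between a purely functional "core" and the
characteristics wrapped around it can be illustrated outside of
VHDL. The following is only a rough Python analogy, not the EIA 567
packages themselves: the class names, field names, and all numeric
values are hypothetical, and constructor arguments merely stand in
for generics passed down from the board level.

```python
from dataclasses import dataclass


def nand2_core(a: bool, b: bool) -> bool:
    """Purely functional "core": no timing, electrical, or physical data."""
    return not (a and b)


@dataclass(frozen=True)
class TimingView:
    """Hypothetical timing characteristics, kept outside the core."""
    t_plh_ns: float  # propagation delay, output low-to-high
    t_phl_ns: float  # propagation delay, output high-to-low


@dataclass(frozen=True)
class ElectricalView:
    """Hypothetical electrical characteristics, kept outside the core."""
    vcc_min_v: float
    vcc_max_v: float


class Nand2Component:
    """Wrapper around the core; parameters arrive from above, like generics."""

    def __init__(self, timing: TimingView, electrical: ElectricalView):
        self.timing = timing
        self.electrical = electrical

    def evaluate(self, a: bool, b: bool, previous_output: bool, vcc_v: float):
        """Return (output, delay_ns) for one input change."""
        # Out-of-range supply voltage is flagged rather than silently accepted.
        if not (self.electrical.vcc_min_v <= vcc_v <= self.electrical.vcc_max_v):
            raise ValueError(f"supply voltage {vcc_v} V out of range")
        output = nand2_core(a, b)
        if output == previous_output:
            delay_ns = 0.0  # simplification: no transition, no delay reported
        elif output:
            delay_ns = self.timing.t_plh_ns
        else:
            delay_ns = self.timing.t_phl_ns
        return output, delay_ns
```

Because nand2_core carries no timing or electrical data, the same
function could be reused unchanged with a different wrapper when the
part is moved to another technology.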

After the VHDL models were written, they were
exchanged among members of the group for review,
simulation, and evaluation. This was done to verify that the
information was modeled correctly, to confirm that portability
between different VHDL toolsets was upheld, and to evaluate any
differences in the way component modeling was
interpreted or approached. Each model was evaluated
using a common "measuring stick" which was developed
using criteria required by the DID (DI-EGDS-80811) and
EIA 567. Points of discrepancy and misunderstanding
with the models were raised as issues and discussed in
order to determine the procedure the "standardized" models
are going to use. This paper will address those issues
later.

Once the VHDL models are revised to reflect the
standard methodology by which to model components, the
"core" model will be used to synthesize the new
component. This synthesis may result in a new module
which implements a one-for-one replacement of the ICs,
if ICs can be found that implement the functionality,
timing, and DC parametrics of the obsolete component.
One of the major problems with redesigning a component
into a new technology, however, is matching the timing
constraints. If this is a concern, then the designer may
want to consider a PLA, PLD, or FPGA that can
implement the correct timing delays.

The proof of our efforts in developing a methodology
for redesign of obsolete parts is going to lie in the actual
production, and then test, of the new parts. Upon
completion of the validation effort, the models will be
provided to the Rapid Acquisition of Manufactured Parts
(RAMP) facility at NAWC, Indianapolis for manufacture.
We have only begun to investigate the design-to-
manufacture interface. At this point we have provided the
RAMP facility with a complete list of the parts included on
the new module. In addition to the list of new components,
this includes the frame and connector numbers from the
old part in order to maintain the same form-fit as the old
part.

A summary of this process is shown in Figure 1.

Figure 1. Summary of Design-to-Manufacture Process for TIREP
[Diagram not recoverable. Surviving labels: "DESIGN MANUFACTURING PRODUCTION TEST" and "MTS".]

A standard convention for VHDL file names and directory
structures needed to be established in order to interface
with the RAMP Product Data Translation System
(RPTS). The following are the VHDL file conventions
which are recommended for use with the subject databases.

* All VHDL files shall reside in the "VHDL" subdirectory
of a specified job. It is recommended that there be no
additional directory structure under "VHDL", although this
is not prohibited. It has been noted that directory
structures can complicate file transfers between computer
systems.

* There shall be no constraint on the number of files
present in the "VHDL" directory.

* A VHDL model file shall be named according to the
following convention. The <filename> shall be generated
by the design engineer and should be consistent with the
naming conventions for VHDL identifiers. The
<filename> should not exceed 8 characters in length. The
<extension> is a 3 character file extension as defined
below.

<filename>.<extension>

* All files contained in the "VHDL" directory shall have
one of the following file extensions:

Extension   File Type/Description

.VHD        VHDL source file

.WAV        VHDL WAVES source file. It is not required that
            this extension be used for WAVES files. It is
            acceptable practice to use .VHD for these files.

.DAT        Model specific data file. Examples of data files
            include the vector file for the WAVES testbench,
            component programming file, etc.

.TXT        An ASCII text file which contains information about
            the VHDL model, its use and application.

* All VHDL model files shall contain only ASCII text.

* DI-EGDS-80811, paragraph 103 dictates the presence of
7 or more text files (.TXT extension) which contain
model content information. These files shall be included
with all VHDL models. It is recommended that these files
be named as follows:

FILE_1.TXT

FILE_2.TXT 

FILE_3.TXT 

etc. 
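
The naming rules above lend themselves to an automated check before
files are handed to the manufacturing side. The sketch below is one
possible way to do that in Python and is not part of the TIREP
deliverables: the function name is hypothetical, extensions are
compared case-insensitively as an assumption, and the identifier
pattern covers only basic VHDL identifiers (a letter followed by
letters, digits, and single underscores). Note that the 8 character
limit and identifier consistency are "should" recommendations in the
convention, so a project might choose to report them as warnings.

```python
import re

# Extensions listed in the convention table above.
ALLOWED_EXTENSIONS = {"VHD", "WAV", "DAT", "TXT"}

# Basic VHDL identifier: starts with a letter, then letters or digits,
# with single underscores allowed only between them.
_VHDL_IDENTIFIER = re.compile(r"^[A-Za-z](?:_?[A-Za-z0-9])*$")

MAX_NAME_LENGTH = 8


def check_vhdl_filename(filename: str) -> list[str]:
    """Return a list of convention violations (empty when compliant)."""
    name, dot, extension = filename.rpartition(".")
    if not dot:
        return ["missing '.<extension>' part"]
    problems = []
    if extension.upper() not in ALLOWED_EXTENSIONS:
        problems.append(f"extension '.{extension}' is not an allowed type")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name '{name}' exceeds {MAX_NAME_LENGTH} characters")
    if not _VHDL_IDENTIFIER.match(name):
        problems.append(f"name '{name}' is not a basic VHDL identifier")
    return problems
```

For example, check_vhdl_filename("FILE_1.TXT") reports no problems,
while a name longer than 8 characters or an unlisted extension is
reported.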

Issues  Encountered  During  VHDL  Model 
Evaluations 

Data  Sheet  Requirements 

One of the modeling issues that was addressed during
evaluations of the VHDL models was the way DC
parametrics and other Mil-Spec Data Sheet requirements
are applied to the functional model. We found limitations
with the way VHDL can model these requirements, thus
making it impossible to write a true electronic datasheet
(EDS) for the SEM "A" boards using VHDL. It was
decided that DC parametrics would be captured using the
EIA 567 electrical view approach. This will provide a
standardized modeling approach which may be used in the
development of customized software directed at the
generation of DC parametric test and evaluation. Also, in
accordance with EIA 567, voltage and temperature levels
out of range will be flagged with assertion statements.
Other requirements that can't be modeled in VHDL, such
as isolation and output crosstalk, will be documented in
the VHDL software with comments.


Power  and  Ground 

The representation of power and ground is another
problem we addressed. It was agreed that power and
ground need to be represented for netlisters and for accurate
documentation of the part. Synthesis tools, however,
have different methods of implementing power and
ground. Since one of our goals is the VHDL-to-
manufacturing interface, which includes synthesis, we
decided not to include power and ground in the
synthesizable "core" model. Power and ground may be
implemented in the component model at the discretion of
the design engineer if functionality of the component
depends on it. In this case, however, power and ground
should still be represented as a structural shell built around
the synthesizable core model. In all other cases, power
and ground should be included as nets on the printed
circuit assembly (PCA) model and, when appropriate, as
pins on the PCA connector. Implementation of power
and ground in the WAVES testbench has yet to be
determined and may rely on software development to
generate test programs from the testbench.

Decomposition  of  Component  for  Modeling 

The architecture of the "core" VHDL model should be
logically modeled as a decomposition of functional
blocks, in order to maintain technology independence. In
some cases, this decomposition may result in an actual
correlation to the decomposition of components on the
board and component level models may be used. The
designer, however, is not constrained by component
boundaries when developing the model.

If, at the discretion of the designer, the component or
subsystem is modeled using a structural architecture that
breaks down to models of actual technology dependent,
physical components, this model must be used to
establish a baseline against which the redesign will be
evaluated. A VHDL model that documents the new design
must also be developed and should be technology
independent, whenever possible, to facilitate synthesis
into future technologies. When the model of the original
design is used only as a development tool and may not be
representative of the redesigned circuit, it is not necessary
that this model be compliant with the DID, DI-EGDS-80811.
The new, potentially synthesizable VHDL model
that is developed, that also serves as documentation of the
redesigned part, should follow all the standards and
guidelines we have listed, including DI-EGDS-80811, as
closely as possible.

Each component or subsystem that is modeled in
VHDL should be accompanied by a high level behavioral
model for future redesign efforts. However, limited
documentation or component complexity may limit the
design engineer's ability to develop high level behavioral
models within a reasonable amount of time and effort.
Therefore, it was decided that development of a high level
behavioral model should be addressed, by the design
engineer, on a case by case basis.

Testbenches  and  Test  Vector  Sets 

Test benches and test vector sets were developed using
the WAVES standard. Test benches should be provided
with at least two test vector files. The first one should
validate a good model that is error free. The second vector
file should be seeded with errors to test that the model
flags the proper assertion statement upon receiving an
error and to test for proper functional toleration. The
"error seeded" vector file should be adequately documented
with comments to indicate which error conditions it is
testing.
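
The two-vector-file idea can be illustrated with a small harness.
This is a Python analogy under stated assumptions, not a WAVES
dataset: the model function, the exception class standing in for a
VHDL assertion, and the vector record layout are all hypothetical.
One vector set exercises an error-free model, and the second is
seeded with illegal inputs, each carrying a comment that names the
error condition it tests.

```python
from dataclasses import dataclass

VALID_LEVELS = {"0", "1"}


class ModelAssertion(Exception):
    """Stand-in for an assertion statement firing inside the model."""


def nand2_model(a: str, b: str) -> str:
    """Hypothetical model under test; flags input levels other than '0'/'1'."""
    if a not in VALID_LEVELS or b not in VALID_LEVELS:
        raise ModelAssertion(f"illegal input level: {a!r}, {b!r}")
    return "0" if (a == "1" and b == "1") else "1"


@dataclass(frozen=True)
class Vector:
    a: str
    b: str
    expected: str                   # ignored when expect_assertion is True
    expect_assertion: bool = False
    comment: str = ""               # documents the seeded error condition


def run_vectors(vectors):
    """Return a list of failure messages; an empty list means the set passed."""
    failures = []
    for i, v in enumerate(vectors):
        try:
            actual = nand2_model(v.a, v.b)
        except ModelAssertion:
            if not v.expect_assertion:
                failures.append(f"vector {i}: unexpected assertion")
            continue
        if v.expect_assertion:
            failures.append(f"vector {i}: assertion not raised ({v.comment})")
        elif actual != v.expected:
            failures.append(f"vector {i}: got {actual}, expected {v.expected}")
    return failures


# First vector set: validates a good, error-free model.
GOOD_VECTORS = [
    Vector("0", "0", "1"),
    Vector("0", "1", "1"),
    Vector("1", "0", "1"),
    Vector("1", "1", "0"),
]

# Second vector set: seeded with errors, each one documented.
ERROR_SEEDED_VECTORS = [
    Vector("X", "1", "", expect_assertion=True, comment="unknown level on input a"),
    Vector("1", "Z", "", expect_assertion=True, comment="high impedance on input b"),
]
```

Both run_vectors(GOOD_VECTORS) and run_vectors(ERROR_SEEDED_VECTORS)
should return an empty list when the model behaves as intended.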

Unresolved  Issues 

During  the  evaluation  of  the  VHDL  models,  several 
issues  were  brought  up  that  remain  unresolved. 

* The implementation of power and ground in the
WAVES testbench has yet to be determined and may rely
on software development to generate test programs from
the testbench.

* Guidelines must be established to detail the manner in
which multiple timing constraints on a single pin will be
handled.

* In order to meet the requirements of DI-EGDS-80811,
nominal  component  delay  values  may  be  modeled  as  the 
average  of  the  minimum  and  maximum  delay  values, 
when  known.  In  those  instances  where  only  the 
maximum  delay  is  specified,  it  will  be  necessary  to 
generate  a  recommended  approach  for  determining  the 
minimum  and  nominal  delays. 
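
The nominal delay rule in the last item is simple arithmetic and can
be sketched as follows. The function name and the use of nanoseconds
are assumptions for illustration. Since the approach for the
maximum-only case is still unresolved, the sketch deliberately
returns no value there instead of guessing one.

```python
from typing import Optional


def nominal_delay(min_delay_ns: Optional[float],
                  max_delay_ns: float) -> Optional[float]:
    """Nominal delay as the average of the minimum and maximum delays.

    Returns None when only the maximum delay is known, because no
    approach for that case has been agreed on yet.
    """
    if min_delay_ns is None:
        return None
    if min_delay_ns > max_delay_ns:
        raise ValueError("minimum delay exceeds maximum delay")
    return (min_delay_ns + max_delay_ns) / 2.0
```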

Conclusion 

TIREP was established to address the costly issue of
component obsolescence in military systems. By using
industry standards such as VHDL and WAVES in the
redesign of obsolete components we not only make full
use of synthesis tools that are available today, but we also
address problems of future obsolescence. The efforts of
TIREP in developing a set of business rules and
processes with which to represent "standardized"
behavioral and structural information and develop a design-
to-manufacture interface are not complete. VHDL
modeling guidelines are currently in the process of being
developed and documented. With the interfaces that exist
between VHDL and PLAs, PLDs, and FPGAs, systems
can be redesigned with greater part redundancy, resulting in
inventories with a smaller number of part types.
Eventually, with the shift to programmable logic
components, we may even see component consolidation
in systems, which would not only greatly reduce redesign-
for-obsolescence costs, but may also reduce system
failure rates as the number of fallible components
decreases.

This paper introduces the Center for Information
Management  (CIM)  Software  Systems  Reengineering 
Process  Model  [8]  and  explores  its  use  in  planning  and 
implementing  Navy  reengineering  efforts.  The  Model 
provides  guidance  for  reengineering  automated 
information  systems  within  the  Department  of  Defense 
(DoD).  This  Model  addresses  many  issues  concerning 
large-scale,  complex  systems  which  impact  Navy  real-time 
system  modernization  efforts.  The  intended  audience  for 
the  Model  is  any  organization  within  DoD  tasked  to 
reengineer information systems; however, Navy efforts to
define  a  reengineering  process  model  for  real-time 
embedded  systems  may  find  the  Model  a  useful  starting 
point. 

Background 

The purpose of the CIM Software Systems
Reengineering Process Model is to capture the essence of
software reengineering as it applies in the DoD
Information Management (IM) community. Two broad
concepts guide software reengineering for DoD IM. The
first concept is the prevention of duplication by joint use
of personnel, information systems, facilities, and services
across DoD. For the IM community, multiple software
systems often provide similar services which may be
consolidated and reduced to a single system through
software reengineering. This holistic view of reuse does
not apply to Navy embedded systems for which a
reduction in duplication at the component level drives
reuse activities. Utilization of common low-level system
components minimizes redundancy in the embedded, real-time
system domain and provides cost-benefits in terms of
maintenance and system development. Reuse and domain
analysis programs within various Navy organizations are
addressing these issues. The second concept guiding
software reengineering is conformance to new regulations,
policies, standards, and guidelines for software acquisition
and support. This concept also drives most Navy
modernization efforts today. Such standards include using
a common programming language (MIL-STD-1815A Ada
Programming Language), moving towards open software
systems (FIPS 146-2 Government Open-Systems
Interconnection Protocol), and maintaining compliance
with a common operating system interface (FIPS 151-1
Portable Operating System Interface Exchange, POSIX).
Guidelines include integrating Commercial Off-the-Shelf
(COTS) products where possible, including Computer-
Aided Software Engineering (CASE) tools.

Model  Overview 

The CIM Software Systems Reengineering
Process Model is represented using the IDEF0 method [3,
p58]. IDEF0 is used to produce a structured representation
of activities or functions and the relationship between
those activities. IDEF0 models are composed of activities
and interfaces, including inputs, controls, outputs, and
mechanisms (Fig. 1).

Fig. 1 IDEF0 Activity Model


Activities are represented as boxes and the
interfaces are depicted as arrows, entering and leaving the
boxes. Inputs enter from the left and outputs leave from
the right of the box; the activity transforms inputs into
outputs. Controls enter at the top of the box; they
provide direction and constraint. Mechanisms,
representing the means used to perform the activity, enter
from the bottom. Activities and their relationships are not
related to or limited by time.

Software Systems Reengineering Process

The CIM Software Systems Reengineering
Process Model describes the activities supporting the
development and maintenance of automated information
systems based on the examination and utilization of
existing software system resources. The guidance
provided by the Model for reengineering information
systems is applicable to Navy reengineering efforts. Two
differences between reengineering automated information
systems and Navy embedded, real-time applications are
(1) the level of effort spent in each of the reengineering
activities, and (2) the methodologies and tools supporting
these activities. Since the CIM Model does not specify
levels of effort or identify specific methodologies and
tools, this model may be useful in planning the
reengineering of embedded, real-time applications. The
CIM Model provides a framework upon which
information on specific methodologies, tools, and case
studies presented during the Systems Reengineering
Technology Conference can be integrated.

The Model's context diagram provides a
framework for reengineering by identifying the interfaces
to external activities (Fig. 2). The names of these
interfaces and their sponsoring activities reflect the
Model's intended use in the IM community. Comparable
activities constrain and guide the reengineering process for
the Navy. In this paper, the names of these interfaces
have been modified to facilitate the Model's use in Navy
applications. The original IM names are listed in
parentheses immediately following the modified names for
reference when applicable.

Inputs

The inputs to the reengineering activity include
the New Requirements (Business Requirements), Existing
System (Automated Information System), Feasibility
Analysis Results, and Reusable Assets. The New
Requirements are optional and specify new functionality
for the known existing system. The Existing System will
be reengineered. Feasibility Analysis Results summarize
preliminary studies which may have been performed,
including a cost/benefit analysis, technical feasibility
study, and a risk analysis. Reusable Assets are software
work products, including source code, documentation,
designs, test data, and specifications. Reuse components
for Navy are stored in a DISA-sponsored repository for
Navy efforts in Washington DC, supported by
NAVCOMTELCOM. Reusable assets are also available
from NUWC, NAVSEA, and FCDSSA, Dam Neck VA.

Controls 

Controls on the reengineering process include the
DoD Enterprise Model [4]; Functional Area Models;
Regulations, Policy, Standards, and Guidelines; Resource
Limitations; Technical Architectures; and Available
Reengineering Technology. Many of these controls
primarily impact automated information systems.
However, similar guidelines may exist for embedded, real-time
systems. The DoD Enterprise Model provides the
high-level vision of the mission area for the reengineered
system, while Functional Area Models govern the business
domain in which this system will operate. For embedded,
real-time applications functional area models would define
the rules governing a class of systems. Regulations, Policy,
Standards, and Guidelines are the documents containing
the principal rules designed for governing and influencing
decisions and actions during software engineering
activities included in reengineering. Resource Limitations
scope the reengineering project by estimating the
manpower, funding, schedule deadlines, and computer
resources available. These limitations are usually imposed
on the organization performing the reengineering by an
external sponsoring organization. Technical Architectures
describe the computing and communications environment
in which the Reengineered System must execute. For the
Navy, this control would be synonymous with a
description of the target environment for the system.
Available Reengineering Technology identifies proposed
methodologies and tools available for reengineering. The
technology available for reengineering automated
information systems is well-publicized and commercially
available. For Navy applications, the available
reengineering technology is often program-specific and
difficult to identify. Naval Sea Systems Command
maintains a database of available tools for supporting CMS-2
and AN/UYK series computers. Reengineering
technology development programs in the Navy include
the Engineering of Complex Systems program at NSWC.
Others include China Lake (assembler issues), NUWC
(reuse projects), NRaD, SBIRs, NAVSEA, FCDSSA, and
the Naval Information System Management Center
(NISMC).


Fig. 2 Reengineering Context Diagram

Outputs

The reengineering effort produces Candidate
Reuse Assets, Recommended Changes to Controls, and
the Reengineered System. Candidate Reuse Assets are
produced and sent to a reuse certification program. These
assets include design models, system specifications, metric
data, data models, process models, design decisions, and
test procedures. Other assets include reengineering
strategies, improved maintenance procedures, and new
business practices. As experience in reengineering
increases, the lessons learned will improve the process.
The impact of integrating new technical architectures
during the reengineering activity and adhering to new
regulations, policy, standards, and guidelines during
reengineering may promote new reengineering strategies.
The Reengineered System, generated from activities
described within the Model, consists of the application
software, data, technical infrastructure, and all associated
documentation.

Mechanisms

Reengineering is supported by Tools,
Repositories, the Project Team, Methodologies, and the
Technical Infrastructure (Computing and Communications
Infrastructure). Tools are automated mechanisms used to
improve productivity in software reengineering. The
types of tools needed for software reengineering include
project management, restructuring, reverse engineering,
and forward engineering tools. Reverse engineering tools
include source code analyzers, design recovery and
redocumentation tools. Forward engineering tools include
code generators, requirements analysis, design support
tools, test case generators, and integration support tools.
The tools available for supporting automated information
systems and Navy embedded systems are very different
due to the programming languages and hardware
platforms upon which these tools operate. The IM
domain has many tools which are commercially available
to support the predominant languages and hardware
platforms for both development and execution of
automated information systems. There are very few
commercially available tools which support reengineering
Navy embedded, real-time systems. Many tools have
been developed for specific Navy projects and identifying
them, their capabilities, and acquiring them for use on
Navy systems other than the ones for which they were
developed can be very difficult. Reengineering support
for Navy embedded real-time systems is primarily a
manual effort, integrated with CASE tools. CMS-2 to
Ada translators include those developed by APL, CCCC,
MITRE, and NRaD.

Repositories are mechanisms for storing and
retrieving information or reusable assets. Examples of
Tool Repositories include those maintained by ECS,
NAWC-China Lake, SPC, and STSC. Repository-based
technology may also be used to store and retrieve
information generated during reengineering, including
system components, Reverse Engineered Products,
components of the Reengineered System, reports defining
Available Reengineering Technology, and Reusable
Assets.

The members of the Project Team for IM include
reengineering experts, maintainers, functional personnel,
and the users of the system. For the Navy, the Project
Team would primarily consist of the maintainers and the
users of the systems. The Project Team should include
experts in the following areas: software/system
engineering, technical infrastructure, function/mission of
the system domain, application software usage, and
reengineering technology. Matching skills with the
activities described in this model ensures productivity and
minimizes risk. The users of the system should be
involved with the Project Team throughout the
reengineering effort. Methodologies are systems of
principles, procedures, and practices applied to the
development, operation, reengineering and support of a
software system.

Methodologies for reengineering include reverse
and forward engineering methodologies. These
Methodologies support a variety of software engineering
activities, which should be investigated to minimize
impact on the sponsoring organization's SEE.
Reengineering methodologies for Navy embedded
real-time systems are primarily informal, project-
dependent, and devised by the prime contractor.

The Technical Infrastructure is the environment
in which the reengineered system operates. The IM
infrastructure is a service utility that provides common
shared computing and communications capabilities,
including data bases, common networks, electronic
messaging, and computing platforms. For Navy systems,
testbed facilities are used to ensure that the system will
operate when deployed.
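
The inputs, controls, outputs, and mechanisms described above can be
collected into one record per IDEF0 activity. The following Python
sketch is only an illustration of that structure: the class name,
field names, and the activity label are assumptions, while the
interface names are the Navy-oriented names used in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Idef0Activity:
    """One IDEF0 box with its four kinds of interface arrows."""
    name: str
    inputs: list[str] = field(default_factory=list)      # enter from the left
    controls: list[str] = field(default_factory=list)    # enter at the top
    outputs: list[str] = field(default_factory=list)     # leave from the right
    mechanisms: list[str] = field(default_factory=list)  # enter from the bottom

    def interfaces(self) -> dict[str, list[str]]:
        """Return all interface arrows grouped by kind."""
        return {
            "inputs": self.inputs,
            "controls": self.controls,
            "outputs": self.outputs,
            "mechanisms": self.mechanisms,
        }


# Context-level activity; the label "Reengineering" is a placeholder.
REENGINEERING = Idef0Activity(
    name="Reengineering",
    inputs=[
        "New Requirements",
        "Existing System",
        "Feasibility Analysis Results",
        "Reusable Assets",
    ],
    controls=[
        "DoD Enterprise Model",
        "Functional Area Models",
        "Regulations, Policy, Standards, and Guidelines",
        "Resource Limitations",
        "Technical Architectures",
        "Available Reengineering Technology",
    ],
    outputs=[
        "Candidate Reuse Assets",
        "Recommended Changes to Controls",
        "Reengineered System",
    ],
    mechanisms=[
        "Tools",
        "Repositories",
        "Project Team",
        "Methodologies",
        "Technical Infrastructure",
    ],
)
```

Keeping the four interface kinds as separate lists mirrors the
positional convention of the diagram, where the side of the box an
arrow touches determines its role.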

Reengineering  Process  Activities 

Software reengineering is composed of three
major activities: Define Project, Reverse Engineer, and
Forward Engineer (Fig. 3). The Model diagrams which
refine these activities are not included in this paper, but
are in the CIM Software Systems Reengineering Process
Model [8].

Define  Project 

The  Project  Team  defines  the  Reengineering 
Project  Plan  which  serves  as  the  controlling  document  for 
the  reengineering  project.  The  Project  Plan  should  be 
flexible  enough  to  handle  modifications  based  on  results 
of  both  the  reverse  and  forward  engineering  activities. 
Analysis  Deliverables  from  forward  engineering  may 
provide  information  about  the  New  Requirements  which 
impact  the  Project  Plan.  Reverse  engineering  provides  a 
complete  understanding  of  the  existing  system  which  may 
also  impact  the  reengineering  effort.  Define  Project  is 
composed  of  the  following  activities:  Define  Objectives, 
Identify  Baseline,  and  Define  Reengineering  Project  Plan. 

The  Objectives  identify  the  organizational  goals 
of  the  reengineering  effort,  including  objectives  for  using 
the  system,  supporting  the  system,  and  applying 
reengineering  technology.  The  objectives  for  using  the 
system  include  performance  issues  and  user  interface 
requirements.  Improvements  in  the  maintenance  process 
and  extending  the  life  expectancy  of  existing  systems  are 
typical  objectives  for  supporting  the  system.  The 
objectives  of  applying  reengineering  technology  include 
proofs-of-concept,  proofs-of-utility,  and  identification  of 
risks  which  might  impede  the  reengineering  process. 


The  Objectives  may  change  based  on  the  scope 
of  the  reengineering  effort.  Recommended  Changes  to 
Objectives  may  result  from  the  Available  Reengineering 
Technology,  Resource  Limitations,  and  New 
Requirements.  Development  of  concrete  measurable 
Objectives  is  an  essential  step  in  establishing  the 
Reengineering  Project  Plan. 

The  Project  Team  will  identify  the  configuration 
items  which  comprise  the  current  System  as  the  Baselined 
System.  These  items  include  the  application  software, 
data,  technical  infrastructure,  and  all  associated 
documentation.  The  Baselined  System  will  not  undergo 
any  modifications  outside  the  scope  of  the  reengineering 
project.  The  activity  of  identifying  the  baseline  does  not 
include  the  analysis  of  any  configuration  items,  but  simply 
identifies  the  system  upon  which  the  reengineering 
activities  will  be  performed.  The  Objectives  may  control 
the  identification  of  the  Baselined  System  by  requiring  the 
reengineering  of  a  specific  version  or  the  consolidation  of 
multiple  versions  of  the  same  system. 

The  Define  Reengineering  Project  Plan  activity  is 
performed  in  four  parts,  including  Develop  Reengineering 
Strategy,  Identify  Methodologies  and  Tools,  Allocate 
Resources,  and  Develop  Reengineering  Project  Plan. 

The  reengineering  strategy  identifies 
reengineering  alternatives  for  incorporating  new 
technology  and  approaches,  and  the  use  of  methodologies 
and  tools.  Possible  alternatives  include  restructuring  and 
redocumentation.  The  alternatives  are  evaluated  with 
respect  to  objectives,  risks,  impacts,  and  requirements. 
The  project  strategy  drives  the  identification  and 
utilization  of  methodologies  and  tools  for  software 
reengineering.  The  strategy  also  identifies  and  describes 
the  structure  of  the  products  expected  from  the 
reengineering effort.

Fig. 3 Reengineering System

Proposed methodologies and tools are identified
through  an  analysis  of  Available  Reengineering 
Technology.  The  proposed  methodologies  and  tools  must 
integrate  into  the  sponsoring  organization's  SEE  and 
support  the  activities  defined  in  the  project  strategy.  The 
Allocate  Resources  and  Develop  Reengineering  Project 
Plan  activities  may  require  revisions  to  the  proposed 
methodologies  and  tools  to  comply  with  the  controls  and 
New  Requirements. 

The project resources are allocated for performing
the reengineering project. Project resources include
personnel, computer resources, tools, and the necessary
training. These resources must remain within the
constraint of the Resource Limitations.

A  structured  plan  is  developed  for  accomplishing 
the  reengineering  under  the  direction  of  Regulations, 
Policy,  Standards,  and  Guidelines.  The  plan  includes  the 
project  strategy,  project  resources,  and  selected  tools  and 
methodologies  for  implementing  the  reengineering. 
Revisions to the project resources are based on constraints
from Regulations, Policy, Standards, and Guidelines, or on
inconsistencies among the project strategy,
methodologies, and tools.

Reverse  Engineer 

The  Project  Team  analyzes  the  documentation, 
application  software,  data,  and  the  technical  infrastructure 
of  the  baselined  system.  This  analysis  is  performed  to 
identify  the  system  components  and  their 
interrelationships,  and  to  capture  these  components  in 
representations  that  provide  a  better  understanding  of  the 
system.  Reusable  Assets  should  be  used  to  compose  these 
representations  when  possible.  Any  new  representations 
of  these  components  should  be  designed  for  reuse  as 
Candidate  Reuse  Assets.  Reverse  Engineered  Products 
must  be  manageable  and  usable  in  the  subsequent  forward 
engineering  activities.  Reverse  Engineer  is  composed  of 
the  activities  called:  Analyze  Documentation,  Analyze 
Application  Software,  Analyze  Data,  Analyze  Technical 
Infrastructure,  and  Integrate  Extracted  Products. 


The  Analyze  Documentation  activity  analyzes 
existing  documentation  to  extract  a  system  specification, 
technical  infrastructure  capabilities,  and  system  design 
decisions. 

The  Analyze  Application  Software  activity 
analyzes  the  existing  application  software  to  extract  the 
software  specification,  functional  requirements,  metric 
data,  data  models,  process  models,  and  software  design 
decisions. 

The  Analyze  Data  activity  analyzes  the  existing 
data  to  extract  data  products  that  are  used  to  define  the 
design  model,  system  specification,  functional 
requirements,  metric  data,  data  models,  and  data  design 
decisions. 

The  Analyze  Technical  Infrastructure  activity 
analyzes  the  existing  technical  infrastructure  to  extract 
technical  infrastructure  products  that  define  its  capabilities 
and  related  design  decisions.  A  good  understanding  of 
the  existing  technical  infrastructure  is  imperative,  since 
many  reengineering  efforts  integrate  new  operating 
systems  and  hardware  platforms.  The  current 
infrastructure  provides  capabilities  to  the  existing  system 
which  must  be  captured  during  reverse  engineering  in 
order  to  have  a  complete  understanding  of  the  current 
system.  These  capabilities  may  still  be  supported  in  the 
Reengineered  System. 

The  Integrate  Extracted  Products  activity 
integrates  the  information  from  the  extracted  products  to 
form  the  Reverse  Engineered  Products  which  are  forward 
engineered.  Reverse  engineering  can  be  used  to  identify 
whether  any  New  Requirements  are  supported  in  the 
existing  system.  Recommended  Product  Revisions  are 
generated  when  an  inconsistency  is  detected  between  one 
or  more  of  the  extracted  products.  These  inconsistencies 
must  be  reconciled  as  part  of  the  reverse  engineering 
process. 

Forward  Engineer 

Within the context of reengineering, forward
engineering comprises the software engineering activities
that consume the products of reengineering activities
(primarily reverse engineering) and reuse, along with new
system requirements, to produce a target system. Forward
Engineering is composed of the activities called Analyze,
Design, Build, Integrate, and Test.

Reverse  Engineered  Products  are  input  to  the 
Analyze  and  Design  activities.  Reusable  Assets  should  be 
used  throughout  the  Forward  Engineer  activity  when 
possible. The Reengineering Project Plan and the
Regulations, Policy, Standards, and Guidelines concerning
application software development guide this activity by
defining the structure of the expected components and
results. Candidate Reuse Assets may be generated during
the Analyze, Design, Build, and Test activities. Each
activity produces the documentation required by
applicable standards, including DoD-STD-2167A,
DoD-STD-7935A, and the proposed MIL-STD-SDD. The DoD
Enterprise Model and Function Area Models are
employed. Life-cycle management is guided by DoD
7920.1 for the Life-cycle Management of Automated
Information Systems.

In the Analyze activity, the New Requirements
and the Reverse Engineered Products are analyzed
to generate the Analysis Deliverables. The
Analysis  Deliverables  include  requirements  for  the  Test 
activity  and  a  formal  specification  of  the  analyzed  New 
Requirements  addressed  in  the  existing  system.  The 
principal  Analysis  Deliverables  are  the  business  rules, 
system  specifications,  design  decisions,  and  test 
procedures. 

In the Design activity, the Analysis Deliverables
and the Reverse Engineered Products concerning Design
are used to generate the Design Components.
Design  Components  are  modules  representing  a  design  of 
the  system  parts  to  be  constructed  during  the  Build 
activity,  including  the  required  documentation 
summarizing  the  results  of  the  design  activity.  Design 
Results  confirm  that  the  Design  Components  have  been 
constructed  according  to  specification  or  they  request 
clarification  of  an  analysis  issue  needed  to  complete  the 
Design activity. The principal Design Components are
the design model, data models, process models, and
design decisions.

In  the  Build  activity,  the  Design  Components  are 
used  to  generate  the  Build  Components.  Build  Results 
confirm  that  the  Build  Components  have  been  constructed 
or  request  clarification  on  a  design  issue  that  is  preventing 
the  completion  of  the  Build  activity.  The  Build 
Components  are  the  constructed  system  parts  to  be 
interfaced  during  the  Integrate  activity. 

In  the  Integrate  activity,  any  number  of  Build 
Components  are  combined  to  form  Integrated 
Components. The Integrate activity ensures that the
interfaces between Build Components are correct and complete.
Integration  Results  confirm  that  the  Build  Components 
have  been  interfaced  successfully  or  request  clarification 
on  an  interface  or  build  issue  that  is  preventing  the 
completion  of  the  Integrate  activity.  The  Integrated 
Components  are  the  interfaced  Build  Components 
representing  part  or  all  of  the  system. 

In  the  Test  activity,  the  Integrated  Components 
are  verified  using  a  test  plan  developed  from  the 
requirements  for  testing  defined  in  the  Analysis 
Deliverables.  Every  Build  Component  is  tested  according 
to  the  individual  component  specification.  The  Test 
Results  confirm  that  the  tests  were  successful  and  describe 
the test procedure. The principal products from this
activity are the tested Reengineered System and the Test
Results.

Conclusion 

Ultimately,  the  process  of  software  reengineering 
must  support  the  high-level  goals  of  any  organization:  (1) 
eliminating  non-essential  products  and  processes;  (2) 
increasing  the  value  of  those  remaining;  and  (3)  increasing 
the  efficiency  through  streamlining,  simplification  and/or 
automation  [9].  Software  reengineering  technology 
provides  a  myriad  of  capabilities  which  support  a  variety 
of  software  engineering  activities. 

Future plans are to validate the CIM Software
Systems Reengineering Process Model by applying it in
software reengineering efforts. Previous efforts served as
the framework for the development of the Model, and
subsequent efforts will prove its effectiveness. The Model
represents the software reengineering process from the
software engineer's viewpoint. Additional models currently
under development at CIM represent software
reengineering from alternative viewpoints, including
software management and acquisition.

Reengineering  emerges  as  a  strategy  for  bringing 
the  cost  of  developing  and  maintaining  software  under 
control.  The  need  for  a  comprehensive  plan  to  apply 
reengineering  technology  is  the  driving  force  in  the  CIM 
Software Systems Reengineering Program. The CIM
Software Systems Reengineering Process Model will assist
program managers facing this situation.

ABSTRACT

Legacy software is a valuable DoD asset which should be leveraged to the greatest degree possible. There is an
immediate need for DoD guidance for conducting technical, economic, and management analyses to determine when
reengineering techniques are beneficial to conduct. Such guidance has been developed under the auspices of the
Joint Logistics Command and the U.S. Air Force. This paper introduces the key concepts of the new draft Software
Reengineering Assessment Handbook (MIL-HDBK-SRAH), which defines a process for conducting an effective
technical, economic, and management assessment to determine whether and how to reengineer legacy software.

INTRODUCTION

The decision to reengineer legacy software is based
on a number of technical, economic, and management
factors. The new draft Software Reengineering
Assessment Handbook (MIL-HDBK-SRAH)
organizes these factors into a decision-making process
which is targeted for the following two cases:

• A single software program needs to be assessed to
determine if, and how, it should be reengineered.
The SRAH process develops a recommended
reengineering strategy for that program, if warranted,
and an estimated return on investment and breakeven
point for that strategy.

• A set of software programs within the organization
needs to be assessed to determine which, if any,
should be reengineered and which reengineering
strategy(ies) should be used. The SRAH process
develops a recommended strategy for each program,
if warranted, and prioritizes the list of programs
based on the need to reengineer, estimated return on
investment, the estimated breakeven point, and
management considerations.

The SRAH process is depicted in Figure 1 and
consists of technical, economic, and management
decision processes. The SRAH process begins with
the identification of programs within the organization
and the application of a quick screening filter to
remove from consideration the programs which are least
likely to show a return on investment through
reengineering. The resulting list of candidate
programs (or, in the first case above, the single
program) is subjected to a technical assessment
process including a set of technical questions
designed to disclose the need to reengineer. The
responses to the question set are used to determine
the need to reengineer and to develop a set of
candidate strategies for that program.


An economic assessment process is used to determine
the preferred strategies for each program. For each
strategy, comparative total life cycle cost estimates
are developed. Parametric cost estimating models
may be used to estimate reengineering,
redevelopment, and continued maintenance/support
costs. Cost risk and parameter sensitivity
assessments should be considered.

A management decision process is used to select the
single recommended strategy for each program and to
prioritize the list of programs. Selection and
prioritization are based on the composite results of
the technical and economic assessments and
management considerations. Evaluation of the need
to reengineer, return on investment, breakeven point,
and management objectives and constraints results in
the recommended course of action.

IDENTIFY POTENTIAL PROGRAMS FOR
REENGINEERING

When evaluating a set of programs within the
organization for reengineering, the MIL-HDBK-SRAH
process begins with listing the potential
programs for reengineering. This list could contain
all the programs in the organization, but should at
least contain any program which is perceived to be a
"problem." Programs over five years old, programs
which require over two man-years per year to maintain,
and programs whose remaining life is over three
years should be on the list.

Figure 1 - MIL-HDBK-SRAH Process

SCREEN PROGRAMS FOR REENGINEERING
Not all of the programs on the list need to be
subjected to the MIL-HDBK-SRAH process. A quick
screening filter is suggested as a means to remove
from consideration the programs which are least likely to
show a return on investment through reengineering.
Programs which meet any of the following criteria
may be eliminated from the list of candidates:

(1) If the program age is less than five years since
Initial Operational Capability (IOC) or program
deployment;

(2) If the annual maintenance budget for software is
less than $250,000 per year or less than two
man-years per year; or

(3) If the remaining life is less than three years.

Programs which meet criterion (1), (2), or (3) may
be retained on the list if desired. Each organization
may desire to develop its own criteria.
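
The screening filter can be sketched in a few lines of Python. This is only an illustration and not part of MIL-HDBK-SRAH: the `Program` record, its field names, the sample programs, and the `retain` override are hypothetical, and the thresholds are simply the three criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class Program:
    """Hypothetical record holding the data the screening criteria need."""
    name: str
    age_years: float              # years since IOC or deployment
    maintenance_budget: float     # annual software maintenance budget, dollars
    maintenance_manyears: float   # annual maintenance effort, man-years
    remaining_life_years: float


def may_be_eliminated(p: Program) -> bool:
    """True if the program meets criterion (1), (2), or (3)."""
    too_young = p.age_years < 5
    low_maintenance = p.maintenance_budget < 250_000 or p.maintenance_manyears < 2
    short_life = p.remaining_life_years < 3
    return too_young or low_maintenance or short_life


def screen(programs, retain=()):
    """Return the candidate list; names in `retain` stay on it regardless."""
    return [p for p in programs if p.name in retain or not may_be_eliminated(p)]


# Made-up sample data for illustration only.
programs = [
    Program("PAYROLL", 12, 900_000, 6, 10),
    Program("NEWSYS", 2, 400_000, 3, 15),    # meets criterion (1)
    Program("SUNSET", 20, 500_000, 4, 2),    # meets criterion (3)
]
candidates = [p.name for p in screen(programs, retain={"SUNSET"})]
print(candidates)
```

The `retain` argument reflects the statement that a program meeting a criterion may still be kept on the list if desired; an organization with its own criteria would replace `may_be_eliminated`.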

REENGINEERING QUESTIONNAIRE
The need to reengineer is viewed as a function of the
following system and organizational factors:

A. Size (Source Lines of Code or Function Points)

B. Complexity (Logical and Data Complexity)

C. Language Type and Portability

D. Development Process Used

E. Program Quality Trend

F. System Age

G. Level of Maintenance Activity

H. Documentation State

I. Impact of System Failure

J. Personnel Factors.

Twenty-five  questions  are  answered  for  each 
candidate  program.  The  questions  are  designed  to 
disclose  the  need  to  reengineer.  Figure  2  shows  the 


SYSTEM FACTORS

Factor A - Size

1. SLOC. How many Source Lines of Code (SLOC) exist in the candidate software?

   2-Less than 15K    4-Between 15K and 100K    6-More than 100K

2. Function Points. How many function points exist in the candidate software?

   2-Less than 500    4-500 to 2500    6-More than 2500

Factor B - Complexity

3. Average SLOC/CSU. What is the average number of Source Lines of Code per Computer Software
   Unit (CSU) in the candidate software?

   3-Less than 50    6-50 to 200    9-More than 200

4. Average Cyclomatic Complexity. What is the statistical average of the Cyclomatic complexity per
   module?

   3-Ten or less    6-Between 10 and 20    9-More than 20

5. Average Essential Complexity. What is the statistical average of the Essential complexity per
   module?

   3-Five or less    6-Between 5 and 10    9-More than 10

6. Data Complexity. Which response best characterizes the state of system data?

   3-Data is managed by relational database(s) which is/are in at least 3rd Normal Form or an
   object-oriented database. The data dictionary is current. Data relationships are clearly
   documented. A high degree of data name rationalization exists.

   6-Data is managed by relational database(s) in less than 3rd Normal Form. The data dictionary and
   data relationship documentation (such as Entity-Relation diagrams) are mostly current; some
   catch-up may be required to support major enhancement. Some cryptic names for relation tables or
   columns exist.

   9-Data is managed as flat files. Some data is no longer used by the program, although some effort
   would be required to determine which. Usable documentation of data definitions and logical
   relationships does not exist.

Factor C - Language

7. 4GL/3GL/Assembler. What is the system's principal language level?

   2-4GL    4-3GL    6-2GL/Assembly Language

8. Number of Languages. How many programming languages does the system use?

   2-One    4-Two    6-More than two

9. Language Portability. Which situation best characterizes the need to change source languages?

   2-No need because: An approved HOL is being used, adequate software support tools (compilers,
   etc.) exist, and the implementation language permits adequate selection from a number of host
   processors.

   4-Some need. One of the conditions for answer #1 is not true. There is some resulting motivation
   to change languages.

   6-Strong need. More than one of the conditions for #1 is not true. Continued software evolution is
   being constrained by the current language.

Factor D - Development Strategy Discipline

10. Development Process Followed? The system was created using a development process that was:

   2-Rigidly followed    4-Sometimes followed    6-Not followed

11. Change Control Discipline Followed? The change control procedure for the system has been:

   2-Strictly enforced    4-Loosely enforced    6-Ad hoc or does not exist

Figure 2 - MIL-HDBK-SRAH Question Set

Factor E - Quality Trend

12. Number of Errors Trend. Over the last 6 months, has the number of errors:

   3-Decreased    6-Leveled off    9-Increased

13. Maintenance Backlog. Does the system have a maintenance backlog?

   3-No    6-Yes, but steady or decreasing    9-Yes and increasing

14. Perceived Quality Trend by Customer/User. Do the system's users think the system quality is:

   3-Improving    6-Remaining the same    9-Declining

Factor F - System Age

15. Age Since First Release. What is the system's age as measured from the first release?

   1-Less than 2 years    2-2 to 7 years    3-More than 7 years

Factor G - Level of Maintenance Activity

16. Annual Change Traffic (Past 12 Months). What is the annual change traffic of the system within the
    past 12 months?

   2-Less than 5%    4-5 to 10%    6-More than 10%

17. Number of Change Releases (Past 5 Years). How many change releases have been released within
    the last 5 years or since the last reengineering effort, whichever is less?

   2-Less than 5    4-5 to 10    6-More than 10

Factor H - Documentation

18. State of Documentation. The system's documentation is best characterized as:

   3-Complete and current
   6-Mostly complete and mostly current
   9-Non-existent or inaccurate

ORGANIZATIONAL FACTORS

Factor I - Impact of System Failure

19. Effect of System Failure. If the system failed what would be the effect?

   3-Little or no damage
   6-Significant damage
   9-Permanent damage, major financial loss, or potential loss of life

20. Contingency Plan. Is there a contingency plan (current, recently tested, and ready at a moment's
    notice) which could be used if the system fails?

   3-Yes or not needed
   6-Yes, but with some difficulty and significant loss of efficiency
   9-No

21. Contribution to User's Mission. How much does the candidate software contribute to the using
    organization's mission?

   3-Not at all    6-Small percentage    9-Significant percentage

Factor J - Personnel

22. Percent of Maintenance Personnel With In-depth System Knowledge. For the staff currently
    maintaining the system: What percentage of the maintenance personnel have in-depth experience with
    the system?

   3-More than 30%    6-5 to 30%    9-Less than 5%

23. Maintainer Experience. What is the average number of years experience as a maintenance
    programmer for those who maintain the existing system?

   3-More than 5 years    6-2 to 5 years    9-Less than 2 years


Figure 2 - MIL-HDBK-SRAH Question Set (Cont.)

24. Maintainer Turnover. What is the percentage of maintenance personnel turnover per year?

   3-Less than 5%    6-Between 5 and 30%    9-More than 30%

25. Original Developers Available? Are the original developers available for consultation?

   3-Yes
   6-Yes, but the system is over 5 years old, or the original developers are not easily accessible
   9-No


Figure 2 - MIL-HDBK-SRAH Question Set (Cont.)


question set. In MIL-HDBK-SRAH, a questionnaire
form is provided for marking responses for each
question. The average response for each factor is
recorded. The resulting points are totaled to provide
a quantitative result representing the need to
reengineer. Programs are then rank-ordered by their
need to reengineer.
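
The scoring rule just described (average the responses within each factor, total the factor averages, rank the programs) can be sketched as follows. This is an illustrative reading of the text rather than the handbook's own worksheet: the function names are hypothetical, the question-to-factor grouping is taken from Figure 2, and skipping unanswered questions when averaging is an assumption of this sketch.

```python
from statistics import mean

# Question numbers belonging to each factor, as grouped in Figure 2.
FACTOR_QUESTIONS = {
    "A": [1, 2],
    "B": [3, 4, 5, 6],
    "C": [7, 8, 9],
    "D": [10, 11],
    "E": [12, 13, 14],
    "F": [15],
    "G": [16, 17],
    "H": [18],
    "I": [19, 20, 21],
    "J": [22, 23, 24, 25],
}


def factor_averages(responses):
    """Average the recorded point values within each factor.

    `responses` maps question number -> point value of the chosen answer.
    ASSUMPTION: unanswered questions are left out of the average.
    """
    averages = {}
    for factor, questions in FACTOR_QUESTIONS.items():
        answered = [responses[q] for q in questions if q in responses]
        if answered:
            averages[factor] = mean(answered)
    return averages


def need_to_reengineer(responses):
    """Total of the factor averages: the quantitative need to reengineer."""
    return sum(factor_averages(responses).values())


def rank_programs(all_responses):
    """Program names ordered from greatest to least need to reengineer."""
    return sorted(all_responses,
                  key=lambda name: need_to_reengineer(all_responses[name]),
                  reverse=True)


# The lowest answer to every question, built from the scales in Figure 2.
lowest = {q: 2 for q in (1, 2, 7, 8, 9, 10, 11, 16, 17)}
lowest.update({q: 3 for q in (3, 4, 5, 6, 12, 13, 14, 18,
                              19, 20, 21, 22, 23, 24, 25)})
lowest[15] = 1
highest = {q: 3 * points for q, points in lowest.items()}

print(need_to_reengineer(lowest), need_to_reengineer(highest))
print(rank_programs({"stable": lowest, "troubled": highest}))
```

Under this reading the total ranges from 24 (lowest answer to every question) to 72 (highest answer to every question).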

The question set grew out of efforts at the Software
Technology Support Center. In [STSC92], Chris
Sittenauer and Mike Olsem introduced a set of
questions to assist in determining if a program needed
to be reengineered. Minor modifications occurred
during the process of gaining technical consensus
among panel members during [SB-I] and in the
following months. As with the entire SRAH process,
a wider consensus on the question set is now being
sought.

IDENTIFY REENGINEERING STRATEGIES
Now that the candidate programs with the greatest
need to be reengineered have been identified,
candidate reengineering strategies can be generated
for those programs. The Reengineering Strategy
Selection Matrix, shown in Figure 3, identifies
candidate strategies based on the responses made to
the question set in the previous section.

Figure 3 - Reengineering Strategy Selection Matrix

ECONOMIC ASSESSMENT

At this point in the MIL-HDBK-SRAH process, a set
of candidate reengineering strategies has been
identified for each program being considered. The
remainder of the process is designed to perform an
economic assessment of the strategies and to execute
a management decision process to select one strategy
per program and prioritize the list of programs.

The purpose of the economic assessment process is to
provide accurate, traceable, and credible comparative
cost information, in a consistent format for each
program, to allow rank ordering of the candidate
strategies according to breakeven point (BP) and
return on investment (ROI).

The economic assessment approach is direct,
repeatable, and logical. Each program is analyzed
separately and independently. For each program, the
candidate strategies are evaluated by estimating total
remaining life cost (RLC), which is the sum of
investment (development) and support costs for the
defined remaining lifetime of the software. For
reengineering strategies, RLC also includes the cost
of supporting the legacy software during the
reengineering development time. All estimates for
each program will be based on the same parametric
model and the same general assumptions to allow the
results to be compared. The preferred strategies are
defined as those strategies whose breakeven point is
less than the remaining life of the legacy software.
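
The RLC definition and the preferred-strategy test can be written down directly. The sketch below is illustrative only: the function and argument names are hypothetical, and the sample figures are the Restructure column of the illustrative example in Figure 5.

```python
def remaining_life_cost(investment, reengineered_support, legacy_support):
    """RLC = investment + support of the reengineered software
    + support of the legacy software during the reengineering period."""
    return investment + reengineered_support + legacy_support


def is_preferred(breakeven_years, remaining_life_years):
    """A strategy is preferred when it breaks even within the remaining life."""
    return breakeven_years < remaining_life_years


# Restructure column of Figure 5: investment, reengineered-software
# support, and legacy-software support.
rlc_restructure = remaining_life_cost(836, 2210, 286)
print(rlc_restructure)
```

The sum is 3,332, one unit above the $3,331 shown in Figure 5, presumably because the published elements are rounded. For the maintain-status-quo baseline the investment and reengineered support are both zero, so RLC reduces to the legacy support cost.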

COST  ELEMENT  STRUCTURE  (CES) 

Each  economic  assessment  will  first  establish  a  CES 
to  be  used  for  the  analysis.  The  CES  may  be  similar 
to  that  shown  in  Figure  4  or  may  be  tailored  to 
satisfy  any  of  the  following  requirements: 


(1) To match the CES from a useful earlier
estimate for this program,

(2) To match the CES output from the parametric
cost estimating model used (the situation experienced
in the above example), or

(3) To highlight a particular cost sensitivity, e.g., to
provide more granularity into the maintenance
estimate.

Since the economic assessment is a comparative
process, it is not necessary to establish an exhaustive
CES, particularly where the cost of the elements
would be the same or similar for all strategies
(facilities or utilities, for example). Costs may be
estimated at the CSU, CSC, or CSCI level and
aggregated to program level.

MIL-HDBK-SRAH leads the analyst through a
process to consider applicable ground rules and
assumptions which need to be documented with the
results of the assessment.

MODEL SELECTION AND DATA GATHERING
Software estimates are normally made by level of
effort, analogy, or parametric methods. For the
handbook, only parametric methods (cost models) are
considered. Examples using several parametric
models are provided as Appendices in MIL-HDBK-SRAH.
Reasons for selecting a particular model may
be familiarity, availability of the model or particular
input data, suitability of the model CES, or a desire
to be comparable with an earlier estimate. In any
case, the same model should be used for all estimates
to be compared. General information on model
estimating and model calibration may be found in
source documents listed in SRAH.

The following list identifies the models which have
been identified for use in the SRAH economic
assessment process:

• COCOMO
• PRICE S
• REVIC
• SEER-SEM
• SLIM
• SOFTCOST

A Reengineering Size Adjustment (RESIZE) model is
included in SRAH as an Appendix and describes a
method of adjusting source lines of code for a
reengineering project for input to a cost model.


ECONOMIC INDICATORS

The economic assessment of candidate strategies for
each program depends on estimating and comparing
economic indicators (RLC, RLC Savings, ROI, and
BP) for each strategy, using the same estimating
model. Figure 5 is an illustrative example of a
summary of the economic assessment of five
strategies for a single program. Strategy #1,


1.0 Investment (Development)
    1.1 Software Development
        1.1.1 Requirements Analysis
        1.1.2 Preliminary Design
        1.1.3 Detailed Design
        1.1.4 CSU Code and Test
        1.1.5 CSC Integration and Test
        1.1.6 CSCI Testing
        1.1.7 System Integration and Test
        1.1.8 Operational Test & Evaluation
        1.1.9 Site Preparation
        1.1.10 Development Tools
        1.1.11 Program Management
        1.1.12 Documentation
    1.2 Site Preparation
    1.3 Training
    1.4 Development Tools
    1.5 Hardware Development
2.0 Support
    2.1 Reengineered Software
        2.1.1 Software Maintenance
        2.1.2 System Operations
        2.1.3 Hardware Maintenance
        2.1.4 Training
    2.2 Legacy Software
        2.2.1 Software Maintenance
        2.2.2 System Operations
        2.2.3 Hardware Maintenance
        2.2.4 Training

Figure 4 - Generic CES

continued maintenance (Maintain Status Quo) of the
legacy software, is the baseline against which the
other strategies are compared. Each of the three
reengineering strategies (#2, #3, #4) would require an
investment (development cost) to achieve the savings
shown. The final strategy (#5) would require full
replacement of the legacy software and would incur
the greatest investment of all candidate strategies.

The RLC Savings is defined as the RLC of Strategy
#1 minus the RLC of the strategy under
consideration. The breakeven point is defined as that
point in time when the RLC of a strategy equals the
RLC of Strategy #1, i.e., the cost of reengineering
equals the cost of continuing to maintain status quo.
A detailed version of the worksheet is contained in
MIL-HDBK-SRAH.
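
A rough sketch of these indicators follows, using the Translate and Restructure columns of Figure 5. Only the RLC Savings formula comes from the text. The ROI formula (savings divided by investment) and the breakeven model (constant annual support rates, with investment and legacy support fully spent by the end of the reengineering period) are assumptions made for illustration; the handbook's worksheet may compute them differently. All function and parameter names are hypothetical.

```python
def rlc_savings(baseline_rlc, strategy_rlc):
    """RLC of Strategy #1 minus RLC of the strategy under consideration."""
    return baseline_rlc - strategy_rlc


def roi(baseline_rlc, strategy_rlc, investment):
    """ASSUMPTION: ROI = RLC Savings / investment (zero when nothing is invested)."""
    if investment == 0:
        return 0.0
    return rlc_savings(baseline_rlc, strategy_rlc) / investment


def breakeven_years(baseline_rlc, remaining_years, investment,
                    legacy_support, reengineered_support, reengineering_years):
    """ASSUMPTION: costs accrue linearly.

    Status quo costs baseline_rlc / remaining_years per year. The strategy
    has spent investment + legacy_support by the end of reengineering, then
    costs reengineered_support / support_years per year. Returns the time at
    which the two cumulative costs are equal, or None if they never meet.
    The result may exceed the remaining life.
    """
    baseline_rate = baseline_rlc / remaining_years
    support_years = remaining_years - reengineering_years
    new_rate = reengineered_support / support_years
    if new_rate >= baseline_rate:
        return None
    spent_up_front = investment + legacy_support
    return ((spent_up_front - reengineering_years * new_rate)
            / (baseline_rate - new_rate))


BASELINE_RLC, REMAINING_YEARS = 4232, 12.0   # Strategy #1 in Figure 5

savings_restructure = rlc_savings(BASELINE_RLC, 3331)
savings_translate = rlc_savings(BASELINE_RLC, 4315)
bp_restructure = breakeven_years(BASELINE_RLC, REMAINING_YEARS, 836, 286, 2210, 0.8)
bp_translate = breakeven_years(BASELINE_RLC, REMAINING_YEARS, 1179, 319, 2818, 0.9)

print(savings_restructure, savings_translate)
print(round(bp_restructure, 1), round(bp_translate, 1))
print(roi(BASELINE_RLC, 4315, 1179) < 0)
```

Under these assumptions Restructure breaks even after a little over six years, while Translate breaks even only after the 12-year remaining life, which is in line with the 12.8 years quoted for Translate in the discussion of the example.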

In this example, the Cost Element Structure (CES)
was summarized (rolled-up). Risk and sensitivity
assessments were excluded from this example for
simplicity. The SRAH process provides a detailed
method for including risk and sensitivity assessments
in the economic analysis.


From the illustrative example in Figure 5, the
following conclusions can be drawn:

• Strategy #1 (Maintain Status Quo) is a low risk
candidate (no investment), but its ROI (arbitrarily set
at zero) places it lower than the preferred candidates.
All other strategy RLCs will be compared to its
RLC.

• Strategy #2 (Redocument) has the lowest
investment of all strategies, but only moderate RLC
Savings (as compared to #1) over the remaining life.
The ROI ranking is the same as the BP ranking.

• Strategy #3 (Translate) is a poor choice,
demonstrating a higher RLC than #1, the highest
investment of the three reengineering strategies, and
negative ROI values. Note also that the BP (12.8 years)
exceeds the remaining life (12 years). This is not a
preferred strategy.

• Strategy #4 (Restructure) is clearly the preferred
choice, showing the best ROI and the earliest BP.


Program: Illustrative Case
FYMIK

                                 Maintain                 Translate
                                 Status Quo  Redocument   Source Code  Restructure  Redevelop

Estimating
  SLOC                           31,574      31,574       31,574       31,574       31,574
  Design Modified (DM)           0%          0%           0%           10%          50%
  Code Modified (CM)             0%          0%           100%         20%          80%
  Integ & Test Modified (IM)     0%          0%           100%         100%         100%
  New Documentation (ND)         0%          100%         20%          50%          80%
  Equiv SLOC (ESLOC)             0           6,946        16,366       14,556       24,975
  Annual Chg Traffic (ACT)       20%         19%          18%          16%          15%
  Remaining Years (Y)            12.0        12.0         12.0         12.0         12.0
  Reengineering Years (YR)       0.0         0.6          0.9          0.8          1.0
  Support Years (YS)             12.0        11.4         11.1         11.2         11.0

Costs by CES
  1.0 Investment                 $0          $344         $1,179       $836         $1,597
  2.0 Support                    $4,232      $3,650       $3,136       $2,496       $2,098
    2.1 Reengineered Software    $0          $3,435       $2,818       $2,210       $1,746
    2.2 Legacy Software          $4,232      $215         $319         $286         $351

Economic Indicators
  1. Remaining Life Cost (RLC)   $4,232      $3,994       $4,315       $3,331       $3,695
  2. RLC Savings vs #1           $0          $239         ($83)        $901         $537
  3. Return On Investment (ROI)
     a. Investment ROI (ROI_I)   0           0.69         (0.07)       1.08         0.34
        Rank Order               4           2            5            1            3
     b. Total ROI (ROI_T)        0           0.06         (0.02)       0.27         0.15
        Rank Order               4           3            5            1            2
  4. Annual Support (AS)         $353        $301         $254         $197         $159
  5. Annual Support Savings      $0          $52          $99          $156         $194
  6. Breakeven Point (BP) in
     Years                       12.0        7.2          12.8         6.2          8.2
        Rank Order               4           2            5            1            3
  7. Preferred Reengineering
     Strategy                    N/A         Yes          No           Yes          Yes


Figure  5  -  Summary  Comparison  Worksheet  for  a  Single  Program  (Example) 


• Strategy #5 (Redevelop) is clearly the greatest
investment, provides only modest RLC Savings, but
still displays a BP within the remaining life. It may
also incur the greatest cost and schedule risk,
something that should be investigated in accordance
with procedures described further in MIL-HDBK-SRAH.

Strategies are ranked in ascending order of BP or
descending order of ROI. By Total ROI, the preferred
strategies are #4, #5, #2, and #1, in that order.
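
The ranking rule can be illustrated with the Total ROI and BP rows of Figure 5. The sketch below is a hedged illustration only: the helper name `rank` and the dictionary layout are hypothetical, and the "BP within the remaining life" filter is one reading of how the worksheet arrives at its preferred set.

```python
# Values transcribed from Figure 5 (illustrative case).
total_roi = {"#1": 0.00, "#2": 0.06, "#3": -0.02, "#4": 0.27, "#5": 0.15}
breakeven_years = {"#1": 12.0, "#2": 7.2, "#3": 12.8, "#4": 6.2, "#5": 8.2}
remaining_life_years = 12.0


def rank(values: dict, descending: bool) -> dict:
    """Map each strategy to its rank order (1 = best)."""
    ordered = sorted(values, key=values.get, reverse=descending)
    return {name: position for position, name in enumerate(ordered, start=1)}


roi_rank = rank(total_roi, descending=True)        # higher ROI is better
bp_rank = rank(breakeven_years, descending=False)  # earlier breakeven is better

# Assumed filter: keep only strategies that break even within the remaining life,
# listed in Total ROI rank order.
preferred = [
    name
    for name in sorted(roi_rank, key=roi_rank.get)
    if breakeven_years[name] <= remaining_life_years
]

print("Total ROI rank:", roi_rank)
print("BP rank:       ", bp_rank)
print("Preferred:     ", preferred)
```

With the Figure 5 values, both rank rows of the worksheet are reproduced, and the preferred list comes out as #4, #5, #2, #1.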

COST RISK AND SENSITIVITY
MIL-HDBK-SRAH includes a process for
reevaluating the results of the economic assessment
and assessing the associated cost risks and parameter
sensitivities. Guidance for documenting the economic
assessment process is provided.

MANAGEMENT DECISION PROCESS
At this point in the MIL-HDBK-SRAH process, a set
of preferred strategies has been identified for each
program being considered. The remainder of the
process is to select one strategy per program and
prioritize the list of programs. Selection and
prioritization are based on the composite results of
the technical and economic assessments and
management considerations. Evaluation of the need
to reengineer, economic indicators, and management
objectives and constraints results in the recommended
course of action.

While the technical and economic assessment
processes are well defined in SRAH, the management
decision process is more sensitive to less well defined
and less tangible program considerations and is based
primarily on judgment. Nonquantifiable program
considerations include perceived risk, resource
availability, requirements realism, estimate
credibility, and schedule uncertainty.

The  primary  objectives  of  the  management  decision 
process  are  to  reduce  software  support  cost,  improve 
software  quality,  and  meet  other  management  or 
organizational  objectives. 

The ideal choice for the highest priority program
would be one for which all of the following were true:


•  Investment  cost  within  budget 

•  High  confidence  in  estimate  and  schedule 

•  Remaining  life  confirmed 

•  Legacy  software  has  the  highest  support  cost 

•  High  probability  of  project  success. 


HISTORY OF MIL-HDBK-SRAH
The Joint Logistics Commanders (JLC) Joint Policy
and Coordinating Group (JPCG) on Computer
Resources Management (CRM) initiated the first draft
of this handbook at the First Software Reengineering
Workshop [SB-1] in September 1992. The handbook
was developed by the members of the Reengineering
Economics Panel at the workshop and was entitled
Reengineering Economics Handbook (MIL-HDBK-REH).
In April 1993, COMPTEK Federal Systems,
Inc. and Management Consulting & Research, Inc.,
refined and enhanced the handbook under contract to
the Air Force Cost Analysis Agency. Refinement
and use of MIL-HDBK-SRAH continues under Joint
Logistics Command and Air Force Cost Analysis
Agency direction.

FOR FURTHER INFORMATION
Further information regarding MIL-HDBK-SRAH
may be obtained through the Air Force Cost Analysis
Agency (AFCAA) from Mr. John B. Donald,
AFCAA, 1111 Jefferson Davis Hwy, Suite 403,
Arlington, VA 22202; phone: (703)746-5865 or
(703)692-0006; MILNET: donald@afcost.af.mil.
Copies of SRAH may be obtained from the Air
Force Software Technology Support Center (STSC),
Hill AFB, Utah; phone: (801)777-8045; or Mr.
Chris Sittenauer at the STSC; phone: (801)777-9730.
Comments regarding the handbook are particularly
solicited and should be sent to Mr. John Clark,
COMPTEK Federal Systems, Inc., 2877 Guardian
Lane, Va Beach, VA, 23452; phone: (804)463-8500;
clark@comptek.mhs.compuserve.com.
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•  Greatest  technical  need  to  reengineer 

•  Earliest  BP  within  the  remaining  life 

•  Highest  ROI 
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Abstract 

Mutually exclusive reengineering design alternatives
for modifying or extending complex systems must be
evaluated rigorously before the most attractive option can
be identified for implementation. This paper presents a
Design Dependent Parameter (DDP) approach utilizing a
general Design Evaluation Function (DEF) to guide the
search through the redesign space. Selection of the best
reengineering alternative is aided by the use of a Design
Evaluation Display (DED) incorporating multiple criteria.
Both the DEF and the DED will be explained and
illustrated for hypothetical systems of deployed repairable
equipment.

I. The system reengineering process*

A reengineering design effort is needed whenever a
deficiency is recognized in an existing system, or when
the system fails to perform its intended mission
satisfactorily. Deficiencies occur and may be revealed for
several reasons:

1)  The  actual  observed  performance  of  the  system  after 
deployment  does  not  fully  meet  the  performance 
requirements  for  which  it  was  designed. 

2)  The  performance  requirements  change,  or  are 
expanded,  after  the  system  has  come  into  being  and 
the  system  does  not  meet  these  new  requirements. 

3) The operating environment for which the system was
originally designed has changed, making the existing
system inadequate for its intended mission.

System  reengineering  evaluation  requires  the 
systematic  application  of  a  rigorous  procedure  that  can 

*System reengineering should be pursued as an integral part of the
system engineering process. A system engineering process model is being
developed at Virginia Tech by Blanchard, Bowen, Fabrycky, Nance, and
Verma under NSWC Contract #N60921-89-A239-0027.


guide  redesign  efforts  toward  a  preferred  solution.  This 
evaluation  procedure  can  also  be  useful  in  identifying 
systems  that  need  to  be  reengineered.  A  systematic 
reengineering  procedure  is  illustrated  in  Figure  1. 

Block 1 in Figure 1 shows the deficiency in the system
to be the genesis of the reengineering process. This
deficiency in the existing system should be the stimulus
that begins the process of reengineering. The deficiency
must be clearly identified so that the reengineering effort
can truly focus on the need.


Figure 1 - A Systematic Reengineering Procedure
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This  deficiency,  or  need  for  reengineering,  arises 
because  the  system  does  not  satisfy  specific  mission  or 
design  criteria.  These  criteria  must  be  identified  as 
indicated  in  Block  2  of  Figure  1.  Criteria  will  usually  be 
both  economic  and  noneconomic. 

Multiple criteria considerations in life-cycle
reengineering analyses arise when both cost and
effectiveness elements are present in the evaluation as
shown in Figure 2. Effectiveness is a measure of mission
fulfillment for a system in terms of the stated operational
need. Mission fulfillment may be expressed by one or
more figures of merit, depending on the type of system
and mission objectives.


Figure 2 - Elements for Cost-Effectiveness


Next, multiple criteria must be expressed as
quantifiable goals for the reengineered system. This
important process is identified in Block 3 of Figure 1 and
should embrace the elements in Figure 2.

Quantification is not an easy task. Most systems
have multiple purposes that significantly complicate the
measurement. Criteria such as availability, reliability,
maintainability, etc. are usually used since they are more
easily quantified than utility, merit gain, etc. Also, it is
appropriate to establish the means with which to quantify
criteria for each alternative. Indirect experimentation
techniques utilizing mathematical and simulation models
come into play here.

As  indicated  by  Block  4  of  Figure  1,  the  existing 
system  must  be  baselined  against  established  design 
criteria.  This  baselining  should  illuminate  the  deficiency 
in  the  system.  Baselining  is  used  as  a  basis  for 
comparison  with  proposed  reengineering  design 
alternatives. In the case of an existing system, design
dependent parameters such as reliability, design life,
weight, etc. are fixed by the current design. If the system
deficiency  is  caused  by  an  operating  environment  change 
(e.g.,  higher  operating  costs,  mission  change,  higher 
shortage  costs),  the  baselining  of  the  system  must 


incorporate  these  factors  in  order  for  the  system 
deficiency  to  be  accurately  identified. 

After  baselining  of  the  existing  system  is  complete,  the 
process  of  reengineering  design  may  begin.  The 
effectiveness  of  these  designs  must  be  evaluated  so  that 
comparison  with  competing  design  alternatives  and  the 
baseline is possible. Only when each reengineering
design has been evaluated can the best option be selected
for implementation.

II. Reengineering design evaluation

Reengineering design evaluation is part of the system
reengineering process of Figure 1. Blocks 5 through 8 in
Figure 1 illustrate the remaining steps.

After  deficiencies  in  the  existing  system  have  been 
identified,  changes  (reengineering  design  options)  must 
be  developed  and  presented  as  mutually  exclusive 
alternatives.  Cost  and  effectiveness  measures  for  these 
alternatives  are  generated  using  the  established  criteria. 
Each  alternative  is  then  displayed  beside  the  baselined 
existing  system,  and  the  lowest  cost  alternative  that  most 
closely meets other criteria is selected as the best
reengineered system design.

Figure  3  illustrates  a  concurrent  approach  to  the 
redesign  generation  and  evaluation  portion  of  the  system 
reengineering  process.  In  Blocks  1  and  2  of  Figure  3, 
human  designers  (using  CAD/CAE  tools)  develop 
reengineering design alternatives. This activity
corresponds to Block 5 in Figure 1.

Block  6  in  Figure  1  identifies  the  activity  of  estimating 
and  predicting  Design  Dependent  Parameter  values  for 
each  alternative  (see  also  Block  3  in  Figure  3).  These 
parameter  values  provide  a  basis  for  comparison  with 
established  design  criteria  to  determine  the  merit  of  each 
alternative. They may be passed to Block 1 in Figure 3 as
appropriate.


Figure  3  -  Concurrent  Engineering  Morphology 
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Redesign  alternatives  must  also  be  judged  from  a 
system  life-cycle  perspective.  In  Block  7  of  Figure  1,  the 
cost-effectiveness  of  each  alternative  is  determined  by 
using  a  life-cycle  oriented  evaluation  function  (also  see 
Block  4  in  Figure  3).  Only  after  each  alternative  has  been 
rigorously evaluated can the least costly be selected which
satisfies all other criteria.

The evaluated alternative is compared to the baseline
(or to the best alternative at this point) as shown in Block
8 of Figure 1 (also see the link between Block 4 and Block
1 in Figure 3). In these situations, decision evaluation is
facilitated by the use of a decision evaluation aid visually
exhibiting both cost and effectiveness measures. Life-cycle
cost and one or more effectiveness measures may be
displayed simultaneously as an aid in decision making. A
Decision Evaluation Display (DED), as shown in Figure
4, is one way of doing this.


[Figure labels: Effectiveness & performance measures; Alternatives]

Figure 4 - The Decision Evaluation Display


Note  that  effectiveness  requirements  (or  thresholds)  are 
shown  on  the  DED.  These  are  useful  to  the  decision 
maker  in  assessing  (subjectively)  the  degree  to  which  each 
alternative  satisfies  effectiveness  criteria.  Each 
alternative  is  displayed  with  its  effectiveness  measures 
shown  in  relation  to  the  established  criteria.  Life-cycle 
cost,  shown  on  the  horizontal  axis,  is  an  objective 
measure.  The  goal  is  to  select  the  alternative  with  the 
lowest  life-cycle  cost  that  best  satisfies  important 
performance  and  effectiveness  measures. 


III. Reengineering evaluation theory

Models and the process of indirect experimentation
provide a convenient means for obtaining factual
information about a new system being designed, or an
existing system which needs to be improved. In most
design  and  operational  situations,  the  objective  sought  is 
the  optimization  of  effectiveness  measures  economically. 
Rarely,  if  ever,  can  this  be  done  by  direct  experimentation 


with  a  system  under  development  or  with  one  that  already 
exists.  The  primary  use  of  simulation  in  any  system 
reengineering  effort  is  to  explore  the  effects  of  alternative 
system  characteristics  on  system  performance  and 
effectiveness,  without  actually  producing  and  testing  each 
candidate  system. 

In system reengineering, mathematical models may be
used during design evaluation of the baseline (existing
system). Then, these models can be invoked for each
redesign alternative. Design Dependent Parameters
(DDP) are the key. These parameters are design
characteristics inherent in the physical equipment which
are subject to manipulation by the designer during the
process of seeking the best design. When embedded in
models, these parameters are the key to indirect
experimentation during the system reengineering process.

Design evaluation in terms of life-cycle cost and
system effectiveness can be facilitated by utilizing the
Design Evaluation Function (DEF) shown in Figure 5.
This function is a mathematical way to link design
actions with operational outcomes. It incorporates Design
Dependent Parameters. From the following definitions of
terms in the DEF, the structure for system redesign
optimization should be evident:

E   = a life-cycle complete evaluation measure
      (usually equivalent life-cycle cost)

X   = design variables (e.g., number of deployed
      units, armor thickness, retirement age, repair
      capacity, rated thrust, etc.)

Y_d = design dependent parameters (e.g., weight,
      reliability, design life, load capacity,
      producibility, maintainability, etc.)

Y_i = design independent parameters (e.g., cost of
      money, labor rates, material cost per unit,
      shortage cost penalty, etc.)


To  Optimally  Achieve 
ACQUISITION  /  UTILIZATION 
Mathematically  Link 
DESIGN  /  OPERATIONS 
Utilizing  The 

DESIGN  EVALUATION  FUNCTION  (DEF) 
For  Choosing  Among  Design  Alternatives 
E = f(X; Y_d, Y_i)

Figure  5  -  The  Design  Evaluation  Function 
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The  Design  Evaluation  Function,  with  its  Design 
Dependent Parameters and design independent
parameters, facilitates optimization within each
alternative. It provides the basis for a classification of the
true difference between redesign alternatives (a design-based
choice) and optimization (a search-based choice).

Reengineering of the system can take many forms.
The system might be reengineered by reoptimizing the
current design's logistics support system. Increasing the
number of units deployed, or the number of repair
channels, might provide the needed change to overcome
the deficiency. However, the examples presented here are
for cases where reengineering consists of redesign of the
physical equipment to change Design Dependent
Parameter values.

Two cases are considered. The first example is a
situation where the reengineering problem arises because
actual observed performance measures of a system do not
meet established criteria. As shown in Figure 6, the
baseline's estimated performance does not accurately
predict the actual effectiveness and performance of the
system and its life-cycle cost (Case A).


[Figure labels: Effectiveness & performance measures; Baseline Estimates; Baseline Actuals]

Figure 6 - DED for Redesign Case A

For this reengineering case, design requirements are
assumed not to have changed since the original design
effort, but the actual deployed system fails to meet one or
more of the requirements. Thus, reengineering of the system
is indicated. Redesigned system alternatives should be
offered to compete with the baseline (the existing system).

In this situation, the methods of predicting
performance measures used during design need to be
redeveloped. Since the system failed to perform as
predicted, the prediction methods are probably deficient.
These methods must be corrected to account for errors in
the original design predictions and estimates. The
methods used need to be refined so that estimated
performance matches actual performance.


The second example is a situation where the system
requirements change (tougher performance criteria or
new criteria) from the requirements used in the original
design. This situation is shown in Figure 7, where actual
values of the baseline's measures no longer meet all of the
new criteria (Case B). As represented in Figure 7 by a
movement down of one of the horizontal lines, one of the
design specifications changes. This will cause the system,
which may have met all original requirements, not to
meet all new requirements.

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Both  cases  outlined  above  will  be  illustrated  by  a 
hypothetical situation involving a population of repairable
equipment which is deployed to meet a demand. This
hypothetical example situation is called the Repairable
Equipment Population System (REPS).

IV.  Hypothetical  reengineering  examples 

Two hypothetical, but realistic, REPS examples will be
presented in this section. These were fashioned to
illustrate how modeling and indirect experimentation for
system optimization apply to system reengineering
design evaluation for both a Case A and a Case B
situation.

REPS  example  overview 

Consider the following situation: A finite population
of repairable equipment is maintained in operation to
meet a demand. As equipment units fail or become
unserviceable, they are repaired and returned to service.
As they age, the older units are removed from the system
and replaced with new units. The problem is to
determine the population size, the replacement age of
units, and the number of repair channels for a given
design alternative, or set of design dependent parameter
values, so that requirements will be met at a minimum
life-cycle cost.

A general schematic of REPS is shown in Figure 8.
Repairable equipment systems exist in many operational
settings. Both the military and the airlines operate and
maintain aircraft with these system characteristics. In
ground transit, vehicles such as rental automobiles, taxis,
and commercial trucks constitute repairable item systems.
Production equipment types such as autoclaves, drill
presses, and weaving looms are populations of equipment
which fit the repairable classification. Additionally, both
military and commercial organizations maintain
populations of repairable system components such as
motors, pumps, valves, etc.


[Figure label: Demand]

Figure 8 - Repairable Equipment Population System


Table 1 summarizes the design variables and system
parameters for REPS. The Design Dependent Parameters
of unit cost (C_u), reliability (MTBF), and maintainability
(MTTR) are central to the system design problem. DDP
values are inherent in the design of the equipment itself.
Design independent parameter values depend solely on
the operating environment of the equipment. They
include demand (D), shortage cost (C_s), interest rate (i),
etc. Design variables, on the other hand, are those factors
that are adjustable and used to optimize within each
design alternative to find the lowest life-cycle cost.


Table 1 - Design Variables and System Parameters

                                                    Design  Design  Design
Variables/Parameters                                Var.    Dep.    Ind.
D = demand in units                                                 X
N = number of units to deploy                       X
M = number of maintenance channels                  X
n = retirement age                                  X
C_u = annual equivalent unit cost per unit                  X
C_r = annual equivalent channel cost per channel                    X
annual overhead cost per channel                                    X
C_s = shortage cost per unit short per year                         X
MTBF = mean time between failure                            X
MTTR = mean time to repair a unit                           X


Figure 9 illustrates, graphically, the optimization
process based on three REPS design variables: number of
deployed units (N), retirement age (n), and number of
repair channels (M). Various combinations of values for
these system design variables are searched until the
lowest life-cycle cost for the alternative is found. This
life-cycle cost results from the optimal mix of values for
the design variables for this alternative.


Figure 9 - REPS Alternative Optimization

Indirect experimentation with REPS may proceed on
the basis of the Design Evaluation Function of Figure 5,
E = f(X; Y_d, Y_i). For REPS, annual equivalent life-cycle
cost (AELCC) will be used as the economic metric.
AELCC is the total of population annual equivalent cost
(PC), repair facility annual cost (RC), annual operating
cost (OC), and annual shortage cost (SC), expressed as:

AELCC = PC + RC + OC + SC.

Population Annual Equivalent Cost is PC = C_u(N),
where C_u = (P - B)(A/P, i, n) + B(i). This annual
equivalency formula uses first cost per unit (P) and its
book value at retirement (B). In these examples, book
value is determined using salvage value and straight-line
depreciation. Repair Facility Annual Cost is RC = C_r(M).
Annual Operating Cost is OC = (EC + LC + PMC +
Other)(N): energy costs, labor costs, preventive
maintenance costs, and other. Annual Shortage Cost is
SC = C_s[E(S)], where E(S) is the expected number of units
short. E(S) is found by multiplying the number of units
short by the probability of that occurrence and summing
across all instances. Refer to the Appendix for detailed
derivation of elements comprising AELCC.
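
The AELCC build-up can be sketched in Python as follows. This is an illustration under stated assumptions, not the Appendix derivation: book value is assumed to fall linearly from first cost to salvage value over the design life, the "operating costs" entry of the parameter tables is treated as an annual cost per deployed unit, and the annual shortage cost is taken as a given input (the $73,484 reported for the Case A redesign) rather than computed from E(S). All function names are hypothetical.

```python
def capital_recovery_factor(i: float, n: int) -> float:
    """(A/P, i, n): converts a present amount into n equal annual amounts."""
    growth = (1.0 + i) ** n
    return i * growth / (growth - 1.0)


def straight_line_book_value(first_cost: float, salvage: float,
                             design_life: float, age: float) -> float:
    """Assumed depreciation: linear from first cost to salvage at design life."""
    return first_cost - (first_cost - salvage) * age / design_life


def unit_annual_equivalent_cost(P: float, B: float, i: float, n: int) -> float:
    """C_u = (P - B)(A/P, i, n) + B(i)."""
    return (P - B) * capital_recovery_factor(i, n) + B * i


def aelcc(N: int, M: int, n: int, P: float, salvage: float, design_life: float,
          i: float, channel_cost: float, operating_cost_per_unit: float,
          shortage_cost: float) -> float:
    """AELCC = PC + RC + OC + SC, with SC supplied by the caller."""
    B = straight_line_book_value(P, salvage, design_life, n)
    PC = unit_annual_equivalent_cost(P, B, i, n) * N
    RC = channel_cost * M
    OC = operating_cost_per_unit * N
    return PC + RC + OC + shortage_cost


# Case A redesign values as reported in the parameter and summary tables.
redesign_aelcc = aelcc(
    N=19, M=3, n=4, P=52_000, salvage=7_000, design_life=6, i=0.10,
    channel_cost=45_000, operating_cost_per_unit=1_750, shortage_cost=73_484,
)
print(f"AELCC = ${redesign_aelcc:,.0f}")
```

Under these assumptions the result lands within a few dollars of the $463,350 reported for the redesign, which suggests the assumed book-value rule is at least consistent with the published figures.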

Calculations  are  done  repeatedly  with  various 
combinations  of  N,  n,  and  M  until  an  optimum  AELCC  is 
achieved  for  each  alternative  (each  instance  of  DDP 
values).  Each  alternative  can  then  be  compared  to  the 
current  best  to  determine  if  all  criteria  have  been  met  with 
AELCC  reduced  as  far  as  possible. 




REPS  redesign  example  (Case  A) 

Consider a deployed REPS for which the system
attributes are as listed in Table 2. Suppose that these are
the actual performance measures experienced by REPS
after deployment. Also, assume that the criteria for this
system during its original design are still in effect. They
are as follows:

1) Design to cost - the deployed population shall have a
first cost not exceeding $1,000K.

2) Probability of shortages - the probability of one or
more equipment units short of demand shall not
exceed 0.38.

3) Equipment reliability - the mean time between failure
for equipment units shall not be less than 0.25 years.


Table 2 - Attributes for Baseline Design

Attributes                      Baseline Actuals

Design Variable Values:
  Population, N                 20
  Repair channels, M            4
  Retirement age, n             4

Independent Parameters:
  Demand                        15
  Shortage cost per year        $73,000
  Interest rate                 10%
  Repair channel cost           $45,000

Dependent Parameters:
  Cost of equipment unit        $43,000
  Salvage value                 $5,000
  Design life in years          6
  Operating costs               $2,300

Measured/Calculated:
  Prob. one or more short       0.27
  AELCC                         $468K
  Average MTBF                  0.22*
  Average MTTR                  0.045

*Violation of criteria


The actual reliability does not meet the specified
criteria. During the original design effort, the system
performance estimates met all the requirements.
However, after deployment, a deficiency was discovered
in the MTBF prediction. The equipment did not perform
as well as predicted (a Case A situation). It is
experiencing an MTBF of only 0.22 years.

A redesign alternative must be offered to compete with
the baseline to correct for the deficiency. Table 3 shows
the alternative's redesign dependent parameters. The
independent parameters are not exhibited, since they are
the same for both the baseline and the alternative.

Optimization of the REPS system for the alternative set
of design dependent parameters follows. The needed
calculations are given in the Appendix for detailed
consideration and study.


Table 3 - Parameter Values for Alternative

Parameter                  Value
Cost of equipment unit     $52,000
Salvage value              $7,000
Design life in years       6
Operating costs            $1,750

Age cohorts                MTBF    MTTR
0-1                        0.20    0.03
1-2                        0.24    0.04
2-3                        0.29    0.05
3-4                        0.29    0.05
4-5                        0.26    0.06
5-6                        0.22    0.07


Table 4 - Points in the Optimum Region

Retirement   Number of     Number of repair channels, M
age, n       units, N      2         3         4
3            19            $598K     $466K     $469K
4            18            $593K     $465K     $466K
4            19            $600K     $463K     $464K
4            20            $611K     $467K     $469K
5            19            $643K     $480K     $467K

Table 4 shows points around the optimum region. As
shown, the point for AELCC = $463K is optimum for this
alternative, since all others are higher. A summary of the
calculated and optimal design variables for the alternative
design is shown in Table 5 (also see the Appendix).
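
The search over design variables can be illustrated with the AELCC values of Table 4. The sketch below simply scans the tabulated points for the minimum. It is not the optimization routine used by the authors, and the pairing of retirement age n with population N for each row is a reading of the published table layout (n = 3 with N = 19; n = 4 with N = 18, 19, 20; n = 5 with N = 19). The names `aelcc_k` and `best_point` are hypothetical.

```python
# AELCC in $K, keyed by (retirement age n, deployed units N), then by
# number of repair channels M. Values transcribed from Table 4.
aelcc_k = {
    (3, 19): {2: 598, 3: 466, 4: 469},
    (4, 18): {2: 593, 3: 465, 4: 466},
    (4, 19): {2: 600, 3: 463, 4: 464},
    (4, 20): {2: 611, 3: 467, 4: 469},
    (5, 19): {2: 643, 3: 480, 4: 467},
}


def best_point(table: dict) -> tuple:
    """Return (cost, n, N, M) for the lowest tabulated AELCC."""
    candidates = [
        (cost, n, N, M)
        for (n, N), by_channels in table.items()
        for M, cost in by_channels.items()
    ]
    return min(candidates)


cost, n, N, M = best_point(aelcc_k)
print(f"Minimum AELCC = ${cost}K at n={n}, N={N}, M={M}")
```

The minimum falls at n = 4, N = 19, M = 3, the same design variable values reported in the redesign summary.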

This REPS alternative meets all established criteria
according to the estimating methods used. Care must be
taken, though, to investigate the source of the original
error in predicting the MTBF. If this prediction error is
not corrected for use in the reengineering effort, the
alternative may perform no better than is being
experienced with the existing design.


Table 5 - Summary Outputs for Redesign

Output Item                         Value
Population first cost               $988,000
Annual operating cost               $33,250
Annual repair facility cost         $135,000
Annual shortage penalty cost        $73,484
Expected (AELCC)                    $463,350
Mean MTBF                           0.26
Probability of one or more short    0.38
Deployed units, N                   19
Repair channels, M                  3
Retirement age, n                   4


Table 6 - System Attributes for Case B

Comparison of the alternative design and the baseline
is now possible. It is facilitated by a Design Evaluation
Display (DED). As the DED shown in Figure 10
illustrates, the existing system (baseline) does not satisfy
the established criteria. However, the reengineered
design alternative does meet the criteria. Although the
initial procurement cost increased $128K, the AELCC
decreased $5K per year.
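
The Case A comparison can be expressed as a simple criteria check. The sketch below is illustrative only: the class and function names are hypothetical, the values are those reported for the baseline and the redesign, and the baseline unit cost of $43,000 is an assumption drawn from the baseline attribute table (it is consistent with the $128K increase in procurement cost quoted above).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Criteria:
    max_first_cost: float   # design to cost, dollars
    max_prob_short: float   # probability of one or more units short
    min_mtbf_years: float   # equipment reliability


@dataclass(frozen=True)
class Design:
    name: str
    units: int
    unit_cost: float
    prob_short: float
    mtbf_years: float
    aelcc: float

    @property
    def first_cost(self) -> float:
        return self.units * self.unit_cost


def violations(design: Design, criteria: Criteria) -> List[str]:
    """List the criteria that the design fails to meet."""
    failed = []
    if design.first_cost > criteria.max_first_cost:
        failed.append("first cost")
    if design.prob_short > criteria.max_prob_short:
        failed.append("probability of shortage")
    if design.mtbf_years < criteria.min_mtbf_years:
        failed.append("MTBF")
    return failed


CASE_A = Criteria(max_first_cost=1_000_000, max_prob_short=0.38, min_mtbf_years=0.25)
baseline = Design("baseline", 20, 43_000, 0.27, 0.22, 468_000)
redesign = Design("redesign", 19, 52_000, 0.38, 0.26, 463_350)

for d in (baseline, redesign):
    print(d.name, "violations:", violations(d, CASE_A) or "none")
print("First cost increase:", redesign.first_cost - baseline.first_cost)
print("AELCC decrease:", baseline.aelcc - redesign.aelcc)
```

Only the baseline's MTBF fails the check. The redesign meets all three criteria, although its probability of shortage sits exactly at the 0.38 limit.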


Figure 10 - Design Evaluation Display for Case A

Table 7 - Optimized Outputs for Alternatives

REPS redesign example (Case B)


The system requirements for this example are as
follows:

1) Design to cost - the deployed population shall have a
first cost not exceeding $800K.

2) Probability of shortages - the probability of one or
more equipment units short of demand shall not
exceed 0.30.

3) Equipment reliability - the mean time between failure
for equipment units shall not be less than 0.25 years.

In  this  case,  siq;>pose  that  the  existing  system  must  be 
reengineeted  because  one  of  its  mission  requirements 
changed  from  a  KnBF  of  at  least  0.2  years  to  a  MTBF  of 
at  least  0.25  years.  Since  the  baseline  system  has  an 
MTBF  of  0.2  years,  the  current  deployed  system  does  not 
now  meet  all  effectiveness  criteria  (a  Case  B  situation). 

The system attributes for the baseline design, and
candidate alternatives, are given in Table 6. The
baseline's MTBF is shown to be 0.20 years, which violates
the new requirement. It must be at least 0.25 years. The
alternatives are offered for comparison with the baseline,
since both redesigns improve the system MTBF. Design
variables were optimized to find the lowest life-cycle cost
for each alternative. Table 7 shows the optimized
effectiveness measures and calculated values for both
alternatives.

A comparison of the two alternatives with the baseline
is shown in Figure 11. As this DED shows, the baseline
design does not meet the changed MTBF criterion.
Alternative 1, while meeting the MTBF criterion, fails to
achieve a low enough probability of one or more units
short. Alternative 2, however, meets all criteria and
should be selected as the best redesign alternative.

V.  Other  reengineering  issues 

This paper deals with those aspects of system
reengineering where the Design Dependent Parameter
values of deployed equipment are the main focus. However,
reengineering efforts can concentrate on other system
factors, such as system design variable values, so as to
optimize the system in operation.

Figure 11 - Design Evaluation Display for Case B


Reengineering of a system because of a deficiency may
not always mean redesigning the physical equipment
itself. For example, if a system is found to need
reengineering, one approach may be to deploy more units,
add to the repair capability, or otherwise change the
support environment so that mission requirements are
more closely met. It would be desirable (and probably
less costly) for a system deficiency to be resolved by
reoptimizing the support subsystem. However, this is not
feasible if the deficiency in the system is one traceable to
a major flaw in design, or to a mission change.

Also of consideration, but not discussed here, is the
extent to which the existing system is deployed. It must
be determined which costs for the existing system are sunk
and which are not. Sunk costs cannot be considered when
comparing the baseline to competing alternatives.
Likewise, the costs of the alternatives may not include a
total initial deployment cost if the alternative is a
retrofit of the existing system. For the case of a total
redeployment of equipment as the alternative, the salvage
value of the existing equipment can be taken as the first
cost of the new equipment to help determine the cost of the
redesigned alternative.

VI. Summary and Conclusions

In these times of tight defense budgets, the military
services are searching for ways to gain a sustainable
utilization advantage for existing systems. System
reengineering design, embracing system optimization by
indirect experimentation, has an excellent chance of
enhancing system acceptability through the integration of
important design considerations (performance, cost, and
quality). But these desiderata may not be attainable
unless the importance of system evaluation involving
optimization is recognized and implemented as part of the
reengineering process.

Optimization may be formalized for redesign by the
identification of Design Dependent Parameters and the
use of the Design Evaluation Function. By incorporating
life-cycle factors into this function, reengineering
alternatives can be compared equivalently. System
improvements through redesign may be identified for
implementation by use of the Design Evaluation Display.

APPENDIX

This Appendix details a sample calculation of Annual
Equivalent Life-Cycle Cost (AELCC) for the alternative
in the hypothetical example (Case A) of Section IV. The
AELCC calculation will utilize independent parameters
from Table 2, the dependent parameters from Table 3,
and optimum values for design variables from Table 4.
Recall that Annual Equivalent Life-Cycle Cost is:

AELCC  =  PC  +  OC  +  RC  +  SC 

Each component of AELCC is derived next.

Annual Equivalent Population Cost:

Compute the book value, B, of the units at a 4 year
retirement age as an input to C_i.

B = $52,000 - 4(annual depreciation) = $22,000

C_i = (P - B)(A/P, i, n) + B(i)
C_i = ($52,000 - $22,000)(0.3155) + $22,000(0.10)
    = $16,404 - $4,740 = $11,664

where the intermediate values are the equivalent form
P(A/P, i, n) = $16,404 and B(A/F, i, n) = $4,740.

PC = C_i(N)

PC = $11,664(19) = $221,616
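
The population cost arithmetic can be checked with a short script. The following is a minimal sketch, assuming an interest rate of 10% (taken from the B(0.10) term) and using the standard capital recovery and sinking fund factors; the function names are illustrative and do not come from the paper. It also shows that the two forms of the unit cost, (P - B)(A/P, i, n) + B(i) and P(A/P, i, n) - B(A/F, i, n), agree.

```python
def capital_recovery_factor(i: float, n: int) -> float:
    """(A/P, i, n): spreads a present amount over n equal annual amounts."""
    growth = (1.0 + i) ** n
    return i * growth / (growth - 1.0)


def sinking_fund_factor(i: float, n: int) -> float:
    """(A/F, i, n): annual amount that accumulates to 1 after n years."""
    return i / ((1.0 + i) ** n - 1.0)


# Values quoted in the worked example; i = 0.10 is inferred from B(0.10).
P = 52_000.0   # first cost per unit
B = 22_000.0   # book value at the retirement age
i = 0.10       # annual interest rate (assumed)
n = 4          # retirement age in years
N = 19         # deployed units

# Form used in the formula line of the appendix.
unit_cost = (P - B) * capital_recovery_factor(i, n) + B * i
# Equivalent form behind the intermediate values $16,404 and $4,740.
unit_cost_alt = P * capital_recovery_factor(i, n) - B * sinking_fund_factor(i, n)

population_cost = unit_cost * N

print(f"(A/P, 10%, 4) = {capital_recovery_factor(i, n):.4f}")
print(f"unit cost     = {unit_cost:,.0f}")
print(f"PC            = {population_cost:,.0f}")
```

Because the script carries full precision, the population cost can differ from the tabulated $221,616 by a few dollars; the paper multiplies the rounded unit cost by N.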


Annual Operating Cost:

OC = (EC + LC + PMC + Other)(N)
OC = $1,750(19) = $33,250


Annual Repair Facility Cost:

RC = C_r(M)

RC = $45,000(3) = $135,000


Annual Shortage Cost:

Shortage Cost is SC = C_s[E(S)], where E(S) is the
expected number of units short. The expected number of
units short can be found from the probability distribution
of n units short, \(P_n\). Let

N = number of units in the population

M = number of service channels in the repair facility

\(\lambda\) = failure rate of a unit, 1/MTBF

\(\mu\) = repair rate of a repair channel, 1/MTTR

n = number of failed units

\(P_n\) = steady-state probability of n failed units

\(P_0\) = probability that no units failed

\(M\mu\) = maximum possible repair rate

\(\lambda_n\) = failure rate when n units already failed

\(\mu_n\) = repair rate when n units already failed

The failure rate of a unit is expressed as \(\lambda = 1/\mathrm{MTBF}\)
and the failure rate of the entire population when n units
already failed can be expressed as \(\lambda_n = (N - n)\lambda\), where
N - n is the number of operational units, each of which
fails at a rate of \(\lambda\). Similarly, the repair rate of a repair
channel is expressed as \(\mu = 1/\mathrm{MTTR}\), and the repair rate
of the entire repair facility when n units have already
failed can be expressed as

\[
\mu_n =
\begin{cases}
n\mu & \text{if } n \in \{1, 2, \ldots, M-1\} \\
M\mu & \text{if } n \in \{M, M+1, \ldots, N\}
\end{cases}
\]

An analysis using a birth-death process is employed
to determine the probability distribution, \(P_n\), for the
number of failed units. In the birth-death process, the
state of the system is the number of failed units (state = 0,
1, 2, ..., N). The rates of change between the states are the
breakdown rate, \(\lambda_n\), and the repair rate, \(\mu_n\). This gives

\[
\begin{array}{l|ccccccc}
\text{between states} & 0, 1 & 1, 2 & \cdots & M-2, M-1 & M-1, M & \cdots & N-1, N \\
\hline
\lambda_n \text{ (left to right)} & N\lambda & (N-1)\lambda & \cdots & (N-M+2)\lambda & (N-M+1)\lambda & \cdots & \lambda \\
\mu_n \text{ (right to left)} & \mu & 2\mu & \cdots & (M-1)\mu & M\mu & \cdots & M\mu
\end{array}
\]

Assuming steady-state operation of the system, this yields

\[
\begin{aligned}
N\lambda P_0 &= \mu P_1 \\
N\lambda P_0 + 2\mu P_2 &= [\mu + (N-1)\lambda] P_1 \\
(N-1)\lambda P_1 + 3\mu P_3 &= [2\mu + (N-2)\lambda] P_2 \\
&\;\;\vdots \\
(N-M+2)\lambda P_{M-2} + M\mu P_M &= [(M-1)\mu + (N-M+1)\lambda] P_{M-1} \\
&\;\;\vdots \\
2\lambda P_{N-2} + M\mu P_N &= (M\mu + \lambda) P_{N-1} \\
\lambda P_{N-1} &= M\mu P_N
\end{aligned}
\]

Additionally,

\[
\sum_{n=0}^{N} P_n = 1
\]

Solving these balance equations gives

\[
P_0 = \frac{1}{\sum_{n=0}^{N} C_n}
\]

where

\[
C_n =
\begin{cases}
\dfrac{N!}{(N-n)!\,n!}\left(\dfrac{\lambda}{\mu}\right)^{n} & \text{if } n = 0, 1, 2, \ldots, M \\[2ex]
\dfrac{N!}{(N-n)!\,M!\,M^{\,n-M}}\left(\dfrac{\lambda}{\mu}\right)^{n} & \text{if } n = M+1, M+2, \ldots, N
\end{cases}
\]

These can now be used to find the steady-state probability
of n failed units as \(P_n = P_0 C_n\) for n = 0, 1, 2, ..., N.

Define the quantity N - D as the number of extra units
to be held in the population. For n = 0, 1, 2, ..., N - D
there is no shortage of units. However, when

n = N - D + 1    a shortage of 1 unit exists
n = N - D + 2    a shortage of 2 units exists
n = N - D + 3    a shortage of 3 units exists
...
n = N            a shortage of D units exists

The expected number of units short, E(S), can be found
by multiplying the number of units short by the
probability of that occurrence and summing across all
instances as

\[
E(S) = \sum_{j=1}^{D} j\,P_{(N-D+j)}
\]
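
The closed form for \(C_n\) is stated without its intermediate algebra. The following sketch reconstructs the missing step under the rate definitions \(\lambda_n = (N-n)\lambda\) and \(\mu_n\) given in this Appendix; it is a standard birth-death argument supplied for the reader, not a derivation taken from the paper.

```latex
\begin{align*}
% Adding the balance equations from state 0 upward leaves one flow
% balance across each cut between neighbouring states n-1 and n:
\lambda_{n-1} P_{n-1} &= \mu_n P_n
  \quad\Longrightarrow\quad
  P_n = P_0 \prod_{k=1}^{n} \frac{\lambda_{k-1}}{\mu_k} = P_0\, C_n \\
% Up to M failed units every failed unit has its own repair channel:
C_n &= \prod_{k=1}^{n} \frac{(N-k+1)\lambda}{k\mu}
     = \frac{N!}{(N-n)!\,n!}\left(\frac{\lambda}{\mu}\right)^{n},
  \qquad n \le M \\
% Beyond M failed units the repair rate is capped at M\mu:
C_n &= \frac{N!}{(N-n)!}\,\lambda^{n}\cdot
       \frac{1}{M!\,\mu^{M}\,(M\mu)^{\,n-M}}
     = \frac{N!}{(N-n)!\,M!\,M^{\,n-M}}\left(\frac{\lambda}{\mu}\right)^{n},
  \qquad n > M \\
% Normalisation fixes the remaining unknown:
\sum_{n=0}^{N} P_n &= P_0 \sum_{n=0}^{N} C_n = 1
  \quad\Longrightarrow\quad
  P_0 = \Bigl(\sum_{n=0}^{N} C_n\Bigr)^{-1}
\end{align*}
```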

The calculation of the shortage cost for the alternative
in the Case A example is based on the MTBF and MTTR
values in Table 3 for years 1 to 4. From these values, the
average MTBF and MTTR for the population can be
computed as

MTBF = (0.20 + 0.24 + 0.29 + 0.29)/4 = 0.2550
MTTR = (0.03 + 0.04 + 0.05 + 0.05)/4 = 0.0425

The failure rate of a unit and the repair rate at a repair
channel are given by

\[
\lambda = \frac{1}{\mathrm{MTBF}} = \frac{1}{0.2550} = 3.9215
\qquad
\mu = \frac{1}{\mathrm{MTTR}} = \frac{1}{0.0425} = 23.5294
\]

yielding \(\lambda/\mu = 1/6\).

Then, compute \(C_n\) for n = 0, 1, ..., 3 as

\[
\begin{aligned}
C_0 &= \frac{19!\,(1/6)^{0}}{19!\,0!} = 1 \\
C_1 &= \frac{19!\,(1/6)^{1}}{18!\,1!} = 3.1665 \\
C_2 &= \frac{19!\,(1/6)^{2}}{17!\,2!} = 4.7496 \\
C_3 &= \frac{19!\,(1/6)^{3}}{16!\,3!} = 4.4856
\end{aligned}
\]

Next, compute \(C_n\) for n = 4, 5, ..., 19 as

\[
\begin{aligned}
C_4 &= \frac{19!\,(1/6)^{4}}{15!\,3!\,3^{1}} = 3.9813 \\
C_5 &= \frac{19!\,(1/6)^{5}}{14!\,3!\,3^{2}} = 3.3224 \\
C_6 &= \frac{19!\,(1/6)^{6}}{13!\,3!\,3^{3}} = 2.5840 \\
&\;\;\vdots \\
C_{18} &= \frac{19!\,(1/6)^{18}}{1!\,3!\,3^{15}} = 0.0000 \\
C_{19} &= \frac{19!\,(1/6)^{19}}{0!\,3!\,3^{16}} = 0.0000
\end{aligned}
\]

Now

\[
\sum_{n=0}^{19} C_n = 27.9390
\]

and

\[
P_0 = \frac{1}{\sum_{n=0}^{19} C_n} = \frac{1}{27.9390} = 0.0358
\]

\(P_n\) for n = 1, 2, ..., N can now be computed from
\(P_n = P_0 C_n = (0.0358)\,C_n\) as

\[
\begin{aligned}
P_0 &= 0.0358 \times 1 = 0.0358 \\
P_1 &= 0.0358 \times 3.1665 = 0.1134 \\
&\;\;\vdots \\
P_5 &= 0.0358 \times 3.3224 = 0.1189 \\
P_6 &= 0.0358 \times 2.5840 = 0.0925 \\
&\;\;\vdots \\
P_{18} &= 0.0358 \times 0.0000 = 0.0000 \\
P_{19} &= 0.0358 \times 0.0000 = 0.0000
\end{aligned}
\]

The expected number of units short can be calculated as

\[
E(S) = \sum_{j=1}^{15} j\,P_{(4+j)}
     = 1(0.1189) + 2(0.0925) + \cdots + 15(0.0000)
     = 1.00663
\]

From which Annual Shortage Cost is

SC = C_s[E(S)]

SC = $73,000(1.00663) = $73,484

The Annual Equivalent Life-Cycle Cost is now

$221,616 + $33,250 + $135,000 + $73,484 = $463,350
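
The shortage calculation above can be reproduced numerically. The following is a minimal sketch, assuming the Case A inputs quoted in this Appendix (N = 19, M = 3, a shortage penalty of $73,000 per unit short, and D = 15 as implied by the sum running to 15 units short); the function and variable names are illustrative only. Small differences in the later decimal places are to be expected because the paper rounds intermediate values.

```python
from math import comb, factorial


def repair_queue_coefficients(N: int, M: int, rho: float) -> list:
    """Return [C_0, ..., C_N] for N units, M repair channels, rho = lambda/mu."""
    coeffs = []
    for n in range(N + 1):
        if n <= M:
            # Every failed unit is in repair: binomial-type term.
            c = comb(N, n) * rho ** n
        else:
            # Repair capacity is saturated at M channels.
            c = factorial(N) / (factorial(N - n) * factorial(M) * M ** (n - M)) * rho ** n
        coeffs.append(c)
    return coeffs


N, M, D = 19, 3, 15                       # deployed units, channels, demand
mtbf = (0.20 + 0.24 + 0.29 + 0.29) / 4    # average MTBF in years
mttr = (0.03 + 0.04 + 0.05 + 0.05) / 4    # average MTTR in years
rho = mttr / mtbf                         # lambda/mu = (1/MTBF)/(1/MTTR)

C = repair_queue_coefficients(N, M, rho)
P0 = 1.0 / sum(C)
P = [P0 * c for c in C]                   # steady-state P_n, n = 0..N

# Expected units short: j units are short when N - D + j units have failed.
expected_short = sum(j * P[N - D + j] for j in range(1, D + 1))
# No shortage while at most N - D units have failed.
prob_short = 1.0 - sum(P[: N - D + 1])

shortage_cost = 73_000 * expected_short
aelcc = 221_616 + 33_250 + 135_000 + shortage_cost

print(f"P0 = {P0:.4f}, E(S) = {expected_short:.5f}")
print(f"Pr(one or more short) = {prob_short:.2f}")
print(f"SC = {shortage_cost:,.0f}, AELCC = {aelcc:,.0f}")
```

Changing N, M, or the MTBF and MTTR inputs in this sketch is one way to explore how the shortage cost responds to the design variables, in the spirit of the points around the optimum listed in Table 4.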


The shortage probability distribution can be calculated
from the \(P_n\) values and plotted as a histogram of
Pr(S = s) = \(P_n\) with n = N - D + s. In this example,
Pr(S = 0) = \(P_0 + P_1 + \cdots + P_4\) and Pr(S = s) = \(P_{4+s}\)
for s of 1 or more:

Pr(S = 0) = 0.622
Pr(S = 1) = 0.119
Pr(S = 2) = 0.093
Pr(S = 3) = 0.067
Pr(S = 4) = 0.046
Pr(S = 5) = 0.027
Pr(S = 6) = 0.015
Pr(S = 7) = 0.008
Pr(S = 8) = 0.003

The probability of one or more units short can be found to
be

1 - Pr(S = 0) = 1 - 0.62 = 0.38

as was shown in Table 5.


METRICS FOR REENGINEERING OF SOFTWARE SYSTEMS

by 

Annette  R.  Ashton  (KS2)  &  William  H.  Farr  (BIO) 

Naval  Surface  Warfare  Center,  Dahlgren  Division 
Dahlgren,  VA  22448-5000 


INTRODUCTION 

With impending budget cuts hanging over us,
many federal agencies, especially in the Defense
Department, are evaluating the efficiency of their
development process. Reengineering of all or part
of a current software system is becoming a way of
life. Billions of dollars have been invested in the
development of systems that may need to be
modified and extended to respond to changing
requirements. Reengineering technology is
necessary if we are to benefit from these extensive
investments.

Robert Grady, in his book on software metrics for
project and process improvement [Gra92], states,
“Because we cannot economically replace all our
old software, we must find better ways to manage
needed changes. Until we do, software maintenance
will continue to represent a large investment and
software quality will not improve.” Grady further
states that we spend 2 to 3 times as much effort
maintaining and enhancing software as we spend
creating new software. To reuse current software
systems successfully, one must better define the
software development process, and an integral part
of this process improvement is the collection,
analysis, and feedback of software metrics.

“Most of the total change to the software system
will be located in only a few modules” was the
finding of Warren Harrison. He further stated that
“we were surprised to find that 10% of the modules
that were changed accounted for over 60% of the
maintenance activity.” [Har90] This is a good
example of where collecting simple metrics gave
the developer an insight he might not otherwise
have had.

“Management can use metrics to understand and
control the maintenance process. As changes are
requested, measurements can be made, impact
assessed, and implementation decisions made. The
more we understand the impact, the less risk we
take when making each change and the better that
we can control software degradation resulting from
change.” [Pfl90]

This paper will give some basic definitions of
software metrics and their role in the software
development lifecycle with special emphasis on
how they may be used in making decisions related
to reengineering principles. We will further give
definitions, tips on using the metrics in your day-
to-day tasks, and graphical representations of the
metrics to help evaluate your software development
process. In addition, the paper will provide some
metrics that are useful for determining when an
existing software system has deteriorated to the
point when reengineering must be considered.

To set a framework for our metrics discussion we
will break the software development process into
the six traditional system development phases:

1. requirements definition and analysis

2. design development

3. code and unit testing

4. system integration and testing

5. verification and validation

6. delivery and operational use

Questions that need to be addressed for each phase
will be given along with the associated metrics to
help answer those questions. More detailed
explanations of the metrics as well as examples of
how they can be used to evaluate the software
system and/or its development process are provided
in the appendix. This paper is not intended to
provide an all inclusive list of metrics, but ones
that we have found useful within our software
development environment. The important
consideration is to use those metrics that are most
appropriate to your environment and the
development process it follows. The reader can
select any or none of the metrics considered here.
For those selected, the reader is free to adapt the
definitions and implementation to their own
process. The issue is to get a handle on the
development process and the products produced;
otherwise demonstrated improvement is not
possible. Measurement provides a basis for
showing you where you are and where you’ve come
from, and it provides a road map to lead you to
where you want to go!

REQUIREMENTS DEFINITION
AND ANALYSIS PHASE

“Defective requirements are a dominant cause of
cost and schedule overrun in defense and aerospace
programs.” [Hai93] Using measurements to track
progress and quality of your requirements may help
answer some of the following questions:

1. “Can we include all these new maintenance
requirements and still meet our schedule with
our reduced staff?”

2. “What percentage of the software modules can
be reused if only these requirements
modifications are made?”

3. “If our goal is to reuse 40% of the system’s
resources, can we accomplish all the needed
modifications to meet the requirements?”

Some specific requirement metrics that could help
answer these questions are:

a. Requirements Size (REQSIZE) - the number
of requirements for a given Computer
Software Configuration Item (CSCI), and

b. Requirements Changes (REQCHG) - the
percentage of requirements of a specific CSCI
that have changed (modified, added, or
deleted) from one baseline/revision to
another.

Using these metrics one could determine the
magnitude of the change and on the basis of that
determine resource allocations for design, coding,
and testing. To aid in this determination one
should establish a metrics database to store these
types of metrics so that better estimates can be
obtained using past similar development projects.
The Appendix contains a detailed description of
these metrics and illustrates in more detail how
they can be employed.
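
As an illustration of how REQSIZE and REQCHG might be computed from two baselines, consider the following minimal sketch. It assumes each baseline is available as a mapping from requirement identifier to requirement text, and it assumes the percentage is taken relative to the size of the earlier baseline; neither the data layout nor the choice of denominator is specified in this paper, and the function name is hypothetical.

```python
def requirements_change(old: dict, new: dict) -> dict:
    """Compare two requirement baselines keyed by requirement ID."""
    added = new.keys() - old.keys()
    deleted = old.keys() - new.keys()
    # A requirement present in both baselines counts as modified
    # when its text differs.
    modified = {rid for rid in old.keys() & new.keys() if old[rid] != new[rid]}

    changed = len(added) + len(deleted) + len(modified)
    reqsize = len(old)
    # Assumption: REQCHG is expressed relative to the earlier baseline.
    reqchg = 100.0 * changed / reqsize if reqsize else 0.0
    return {
        "REQSIZE": reqsize,
        "added": sorted(added),
        "deleted": sorted(deleted),
        "modified": sorted(modified),
        "REQCHG": reqchg,
    }


# Hypothetical baselines for one CSCI.
baseline_1 = {"R1": "track target", "R2": "report status",
              "R3": "log events", "R4": "self test"}
baseline_2 = {"R1": "track target", "R2": "report status every second",
              "R4": "self test", "R5": "encrypt log"}

result = requirements_change(baseline_1, baseline_2)
print(result)
```

Storing one such record per baseline in a metrics database would give the history needed to estimate design, coding, and testing effort for later changes of similar magnitude.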

DESIGN  DEVELOPMENT  PHASE 

The following questions might arise during the
design development phase of the lifecycle:

1. “Given this set of requirements changing, can
we complete our design within our given
schedule with our reduced staff?”

2. “What percentage of the software design can
be reused given the changed requirements?”

3. “If we achieved 40% reuse last year, what can
we do to increase it this year?”

Metrics that could help answer these questions are:

a. Design Size (DESSIZE) - the number of
design units of a CSCI. By a design unit we
might mean the number of lines of Program
Design Language (PDL) or the number of
bubbles/arcs within a Data Flow Diagram for
the proposed changes. You need some type
of size measurement along these lines to
again determine the magnitude of the changes.

b. Design Changes (DESCHG) - the percent of
the original design that has changed
(modified, added, or deleted) from one
baseline/revision to another. The same
design units considered for DESSIZE are
employed here.

Again  these  metrics  allow  one  to  assess  the  impact 
of  the  changes  for  staffing,  schedule,  and  cost 
determination.  (See  the  Appendix  for  further 
elaboration.) 

It is at this stage that performance analysis may
also be addressed in the reengineered system
through the use of rapid prototyping and
simulation. A number of new tools are under
development (NSWCDD’s DESTINATION,
Advanced System Technologies’ QUEST, and
Conceptual Software Systems’ ExpeR/T among
others) that will take a design and generate a
simulation representation of the system for
analysis purposes.

CODE/UNIT TEST AND SYSTEM
INTEGRATION/TEST PHASES

With the design in place, unit coding and testing
begins, followed by the system’s integration and its
associated testing. Using reengineering technology
brings the challenge of pulling together old and
new code into one software product. Keeping track
of which modules are reused and which are updated
is one way toward raising the awareness of the
reuse issue. An awareness of the current
development status and of your well-defined goals
will often show you what reengineering steps
must be taken to achieve those goals.

Metrics  that  could  help  here  are: 

a. The software program’s size. Some specific
examples that we use are: KSLOC - the
number of executable lines of code divided by
1000, and KTLOC - the total number of lines
of code (including blanks and comments)
divided by 1000. Many different
measurements can be used for sizing other
than the ones presented. The important point
is to decide on a common definition and then
use it consistently within the organization so
that a common frame of reference is obtained.
These measurements can also be further
broken down to whatever level is appropriate
for the software system. For example, the
Computer Software Unit (CSU), the
smallest compilable set of code, may be
appropriate.

b. The number of modules or CSUs
(NUMMOD).

c. The percentage of CSUs changing
(MODCHG). This metric is especially
important in the reengineering process if
reusability is emphasized.

d. The percentage of CSUs tested and the
percentage of requirements tested. These
metrics can be gathered at the unit test level
or at system integration testing to ensure
that all requirements have been incorporated
into the code and adequately tested. Every test
case must cover at least one requirement and
vice versa. These metrics are plotted against
time to determine development progress. At
the conclusion of this lifecycle phase, both
metrics should be at or near 100%.

e. The size of documentation (DOCSIZE).

f. The percentage of documents that have
changed (DOCCHG).

All of these metrics address questions related
to assessing the impact of the change, whether the
impact is viewed from a cost, schedule, staffing,
testing, or reuse perspective. These metrics all
provide the system’s developer an idea of the size
and complexity of the changes being made. If
only 10% of the CSUs are being reused from the
previous version, the management decisions made
relating to the effort will be much different than the
ones if 90% were being reused.
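
The size metrics lend themselves to simple automation. The following is a minimal sketch, assuming that an "executable" line can be approximated as any line that is neither blank nor a whole-line comment, and assuming an Ada-style `--` comment marker; both choices are illustrative, and each organization would substitute its own agreed counting rules.

```python
def size_metrics(source: str, comment_prefix: str = "--") -> dict:
    """Return KSLOC and KTLOC for one unit of source text."""
    total_lines = 0
    executable_lines = 0
    for line in source.splitlines():
        total_lines += 1                      # KTLOC counts every line
        stripped = line.strip()
        # Approximation: count lines that are neither blank nor
        # whole-line comments as executable.
        if stripped and not stripped.startswith(comment_prefix):
            executable_lines += 1
    return {
        "KSLOC": executable_lines / 1000,
        "KTLOC": total_lines / 1000,
    }


# Hypothetical CSU with five lines, two of them executable.
sample_csu = "\n".join([
    "-- update counters",
    "",
    "x := 1;",
    "y := 2;  -- trailing comment still counts as code",
    "   -- indented comment",
])

metrics = size_metrics(sample_csu)
print(metrics)
```

Whatever counting rule is chosen, applying the same function to every CSU is what provides the common frame of reference called for in item a.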

VERIFICATION  AND 
VALIDATION  PHASE 

HP reported that “you will find 1 defect post-release
for every 10 you find pre-release during test.”
[Gra92] By collecting and analyzing a simple
metric such as the number of defects in HP’s systems,
a very valuable statistic was established.
Verification and validation (V&V) is a good point
in the development lifecycle to formally keep track
of software defects.

In preparation for the V&V lifecycle phase,
questions relating to staffing and the level of
testing are often asked. Among the questions are:

1. “How many people should be assigned to this
software system V&V since only 10% of the
requirements changed and 25% of the design
changed?”

2. “How much testing needs to be done to ensure
the code matches the requirements?”

Besides employing the metrics described for the code
and integration testing phases, since the system is now
under some type of configuration management
control one could collect metrics relating to
quality. These may include such metrics as:

a. Number of defects (NUMPR) per
baseline/revision.

b. A metric that rates the criticality of the defect
(ERROR SEVERITY). (An example rating
scheme is enclosed in the Appendix where we
employ a 1 to 6 level rating with 1 being the
worst.)

c. Some type of metric that characterizes the
defect type (ERROR CLASSIFICATION).
This allows the user to better understand the
development process and the kinds of defects
that are being made. If one particular type is
predominating, the developer could
specifically target process improvement steps
in this area. (A sample classification scheme
that we employ for software systems is
enclosed in the Appendix.)

d. Number of defects per KSLOC (DEFECT
DENSITY). This metric can be used to
determine if the system is ready for release.
Too high a value may indicate problems.

These are just a sampling of the types of metrics
that can be collected. In addition, the metrics can
be cross tabulated against one another, providing the
user an additional perspective. An example would
be number of defects broken down by error
severity.
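
The defect metrics and their cross tabulation can be sketched in a few lines. The example below is a minimal illustration that assumes defect reports are held as simple records with a severity and a classification; the record fields, category names, and sample values are hypothetical and are not the schemes given in the Appendix.

```python
from collections import Counter

# Hypothetical problem reports against one baseline.
defects = [
    {"id": "PR-1", "severity": 1, "category": "logic"},
    {"id": "PR-2", "severity": 3, "category": "interface"},
    {"id": "PR-3", "severity": 3, "category": "logic"},
    {"id": "PR-4", "severity": 5, "category": "documentation"},
]
ksloc = 12.5  # assumed size of the baseline, thousands of executable lines


def defect_density(num_defects: int, ksloc: float) -> float:
    """DEFECT DENSITY: defects per thousand executable lines of code."""
    if ksloc <= 0:
        raise ValueError("ksloc must be positive")
    return num_defects / ksloc


numpr = len(defects)                       # NUMPR for this baseline
density = defect_density(numpr, ksloc)

# Cross tabulations: defects by severity, and by (category, severity).
by_severity = Counter(d["severity"] for d in defects)
by_category_and_severity = Counter(
    (d["category"], d["severity"]) for d in defects
)

print(f"NUMPR = {numpr}, DEFECT DENSITY = {density:.2f} per KSLOC")
for severity in sorted(by_severity):
    print(f"severity {severity}: {by_severity[severity]} defect(s)")
```

A category that dominates the second tabulation is the kind of signal that could direct process improvement toward one specific area.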

DELIVERY  AND  OPERATIONAL 
PHASE 

The delivery/operational phase of the lifecycle will
be the time to collect the final set of metrics on the
product. They will be collected both before it
goes to the end user (delivery) and during its
operational use. This type of information is used
to compare back to the metrics collected earlier in
the lifecycle and, during its use, to monitor changes.
For example, at delivery one might collect design
size and compare it with the size at the completion
of the design phase. Has the design size changed?
It shouldn’t have, but many times changes are made
further down in the lifecycle than they should be.
Both a process and a product problem are indicated if
the size change is substantial. After delivery the
quality of the product can be monitored. If the
quality begins to go down to the point that further
maintenance is not advisable, reengineering of the
entire system may be warranted. As a system is
updated in the maintenance phase with new
enhancements and corrections to existing faults, the
risk of error introduction and the demand for
resources to make the updates increase dramatically.

Questions  that  are  often  asked  at  this  point  in  the 
development  process  are: 


1. “Has the quality of the system degenerated to
the point that an entire new development
effort is indicated?”

2.  “How  fast  are  changes  being  made  to  the 
system?” 

3.  “What  level  of  support  is  required  to  maintain 
the  existing  system?” 

Metrics to address these questions would include all
of the quality metrics discussed in the verification
and validation phase plus some of the requirements
through design metrics. This latter set would cover
any maintenance activities on the operational
software. In addition, some other metrics that can
be used include:

a. System Reliability - The probability the
system will operate without failure for a
specified time under a specified environment.

b. Mean Time To Failure (MTTF) - The
expected time to the next failure of the
system.

c. Availability - The probability the system will
be operational during a specified time and in a
specified environment.

These various metrics can be plotted against time
during the operational use to determine if quality is
going down. For example, if the operational
reliability of the system has reached a point that is
no longer acceptable, a decision will need to be
made if continued upgrading is feasible or whether
it is more cost effective to simply reengineer the
entire system.
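
To make the three operational metrics concrete, the following minimal sketch estimates them from logged data. It assumes a constant failure rate, so that reliability over a mission time t is exp(-t / MTTF); that exponential model is a simplifying assumption chosen for illustration and is not one of the models this paper attributes to any tool. The function names and sample figures are hypothetical.

```python
import math


def mean_time_to_failure(times_between_failures: list) -> float:
    """Estimate MTTF as the average observed time between failures."""
    if not times_between_failures:
        raise ValueError("at least one observation is required")
    return sum(times_between_failures) / len(times_between_failures)


def reliability(mission_time: float, mttf: float) -> float:
    """Probability of no failure during mission_time, assuming a
    constant failure rate of 1/MTTF (exponential model)."""
    return math.exp(-mission_time / mttf)


def availability(uptime: float, downtime: float) -> float:
    """Fraction of the observed period in which the system was operational."""
    return uptime / (uptime + downtime)


# Hypothetical operating log, in hours. The shrinking intervals are the
# kind of trend that would show up as declining quality when plotted.
observed_intervals = [120.0, 95.0, 60.0, 45.0]

mttf = mean_time_to_failure(observed_intervals)
r_8h = reliability(8.0, mttf)
avail = availability(uptime=320.0, downtime=16.0)

print(f"MTTF = {mttf:.1f} h, R(8 h) = {r_8h:.3f}, availability = {avail:.3f}")
```

Recomputing these values for each reporting period and plotting them against time gives the trend on which a decision between continued upgrading and reengineering could be based.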

An interactive computer program called the
Statistical Modeling and Estimation of Reliability
Functions for Software (SMERFS) [Far93] is
available to estimate both Reliability and MTTF,
among others, for the software components of the
system. The program is completely machine
transportable and incorporates eleven of the most
well known software reliability models appearing
in the literature. It allows the user to do a
complete reliability analysis using either the
number of software faults per unit of time or the
elapsed time between fault occurrences. Using
either of these two data types, the user can input
the data via keyboard or file, edit it if necessary,
transform it, plot it, do a preliminary analysis of
the most appropriate model, model the data,
analyze the resulting model fit, and then do
reliability predictions and trade-off analyses. A
sample plot is shown to illustrate how this
program could be used in a reengineering
framework.

Figure 1 is a plot of the predicted operational
reliability of a program over a 1 year period. As
one can see, the reliability, because of the changes
that have been made to the software over this time
period, has degraded to the point that a rework of
the entire system may be necessary.


Figure  1  Reliability 

Some  additional  metrics  that  can  be  collected  over 
all  phases  include: 

a. Engineering months (NUMMON) - the actual
number of months the project took.

b. Staffing time (NUMPEO) - the number of
full time people engaged in the development
effort. This includes support personnel as
well as management time.

c. Average Staffing (AVSTAFF) - NUMPEO
divided by NUMMON.

The above metrics can be broken down by phase of
the lifecycle and used to determine scheduling and
staffing requirements for similar reengineering
efforts.

CONCLUSION 

This paper has provided suggested metrics for the
various phases of a reengineering effort. This list
is not all inclusive. It is intended to be illustrative
of the type of measurements that one would be
interested in. The metrics selected in this paper are
illustrative of the ones collected in our
development process. Few metrics have been
given relating to cost, security issues, or
performance. The emphasis has been on quality,
scheduling, and staffing issues. These areas are
deemed important within our environment. The
important consideration is to define a set of
objectives for your development effort and then to
select a set of metrics that will provide quantitative
measures of how well those objectives are being
met.

APPENDIX: METRICS


Description:


This section will elaborate some of the metrics
discussed earlier by illustrating how they can be used in
the re-engineering process. In this section we will
provide the metric name, its definition as it relates to
our development environment, and how it can be used
to relate back to the product and/or process. An
example is provided when appropriate to illustrate the
usage.

Metric:  AVSTAFF 

Description:

The metric AVSTAFF is the average effort (person-
month) expended per month for a program. It is
calculated from two other metrics, NUMPEO and
NUMMON. It is useful for determining the level of
staffing required for future development efforts and for
tracking current efforts.

Feedback/Analysis:

AVSTAFF would be used by project leaders to
determine how well actual staffing meets the original
projection. Previous values can also be used as data
points to create scheduling and staffing estimates for
future endeavors. Figure A-1 shows how the average
staffing effort for program X has increased over its
development history, peaking in the critical design
phase. The plotting of this metric could represent a
particular phase (e.g. coding and unit testing) or the
entire life cycle history of a program.


Figure A-1: AVSTAFF SIZE AS A FUNCTION OF TIME FOR PROGRAM X

Metric:  DEFECT  DENSITY 


Defect density indicates the number of software errors (where
error severity is 1 - 6) per KSLOC per program. This metric is a
measure of code quality and testing adequacy.

Supporting Definitions:

KSLOC  -  the  number  of  executable  lines  of  code 
(excluding  blanks  or  comments)  of  a  program  divided  by 
1,000. 

Error Severity - a seven-level rating system for evaluating the
severity of software errors.
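
As a minimal sketch of the definition above, the following Python
function counts the errors whose severity is 1 through 6 and divides
by KSLOC. Severity 0 ("no error") reports are excluded, as the
definition implies. The function name and the sample numbers are
hypothetical.

```python
def defect_density(severities, executable_lines):
    """Errors of severity 1-6 per thousand executable lines (KSLOC)."""
    ksloc = executable_lines / 1000
    if ksloc <= 0:
        raise ValueError("program must have at least one executable line")
    # Severity 0 means "no error", so it is not counted as a defect.
    counted = sum(1 for s in severities if 1 <= s <= 6)
    return counted / ksloc


# Hypothetical program: 2,000 executable lines, five problem reports,
# one of which was rated severity 0.
print(defect_density([1, 3, 0, 6, 2], 2000))
```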

Feedback/analysis:

Defect density provides an idea of the complexity of the code or
the reliability of the code. It also provides an indication of the
level of Quality Assurance testing that was performed. A high
defect density may indicate poor code quality, while a low density
might indicate that inadequate testing was performed. Management
can use this measurement to evaluate the quality of the software
and decide on a future course of action. Figure A-2 illustrates how
the defect density can be tracked for a given program to determine
how the quality is changing over time.

This metric can be plotted against a baseline (revision or yearly)
for a given program to see if the quality of the program is going
up or down. The following figure shows what a report of this
nature might look like:

Figure A-2: DEFECT DENSITY FOR PROGRAM X

From this plot one can see that the defect density of Program X
was steadily decreasing over the first 3 baselines (revisions). It
then suddenly went up on baseline 4. Management would then need to
investigate what led to this sudden change (e.g., extensive new
capabilities added, more testing performed in contrast to earlier
versions). If this trend persisted, more detailed tracking of this
program would be required and appropriate corrective action taken.

Metric: DESCHG

Description:

Percent of software design that has changed (modified, added, or
deleted) from one baseline/revision to another, e.g., Program
Design Language (PDL), Data Flow Diagrams (DFD).

DESCHG = (# OF CHANGES / DESSIZE) x 100

Feedback/analysis:

DESCHG may be used to evaluate program stability and to estimate
resources needed by knowing how much of the code must be changed.
It will be used in conjunction with MODCHG and REQCHG to estimate
how many modules in a program will be changing in the future.
Caution: Don't try to compare DESCHG across different software.
Programmers design software in many different styles. The level of
detail the programmer chooses will often vary. Also, don't assume
that because the percent of requirements change (REQCHG) is a small
number that the DESCHG will also be small. Figure A-3 illustrates
how DESCHG might be used with NUMPEO.

FIGURE A-3: DESCHG

The figure displays historical information about previous
maintenance efforts for a piece of software. Using this figure the
developer could predict that for DESCHG values < 30, NUMPEO is
roughly half the DESCHG value. For values > 30, NUMPEO is closer
to 40% of the DESCHG value.
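
The percent-change calculation and the rule of thumb read off
Figure A-3 can be sketched as follows. This is an illustration under
stated assumptions, not a calibrated estimator: the factor of 100 is
taken from the word "percent" in the definition, the text does not
say which factor applies when DESCHG is exactly 30 (the sketch
arbitrarily uses the 40% factor there), and the function names are
hypothetical.

```python
def percent_changed(changed_units, total_units):
    """Percent of units changed between baselines (e.g., DESCHG)."""
    if total_units <= 0:
        raise ValueError("total size must be positive")
    return 100.0 * changed_units / total_units


def estimate_numpeo_from_deschg(deschg):
    """Rule of thumb from the historical data behind Figure A-3.

    Below 30 percent, NUMPEO is roughly half of DESCHG; above it,
    closer to 40 percent. The choice at exactly 30 is an assumption.
    """
    factor = 0.5 if deschg < 30 else 0.4
    return factor * deschg


deschg = percent_changed(50, 200)           # 50 of 200 design units changed
print(deschg, estimate_numpeo_from_deschg(deschg))
```

The same `percent_changed` helper would serve for the other
percent-change metrics in this appendix, since they share the
changed-over-total form.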


Metric:  DESSIZE 

Description:

DESSIZE: Number of design units (e.g., PDL lines, DFD
bubbles/arcs) of a CSCI.

Feedback/analysis:

DESSIZE is used to calculate DESCHG. Knowing the relative sizes of
the software design is important when used in conjunction with
other metrics to make an overall assessment of the software
process/product. Figure A-4 illustrates how DESSIZE might be used
with NUMPEO.


FIGURE  A-4:  DESSIZE 


The figure displays historical information about previous baseline
efforts for similar pieces of software. Using this figure the
developer could predict an estimated NUMPEO based on DESSIZE.


Metric:  DOCCHG 

Description:

DOCCHG: Percent of pages of a specific document that have changed
(modified, added, or deleted) from one baseline/revision to
another.

DOCCHG = (# OF CHANGED PAGES / DOCSIZE) x 100

Supporting Definitions:

DOCSIZE: Number of pages of a specific document.

Feedback/analysis:

DOCCHG will be used to evaluate software stability and to estimate
resources needed by knowing how much of the documentation has been
changed. Figure A-5 illustrates how DOCCHG might be used.


FIGURE A-5: DOCCHG FOR PROGRAM X

The figure displays the percentages that three documents
describing Program X have changed. Using this figure the tester
could conclude that more time should be set aside for reviewing
document two since DOCCHG is higher for that document.

Metric: DOCSIZE

Description:

DOCSIZE: Number of pages of a specified document.

Feedback/analysis:

DOCSIZE is used to calculate DOCCHG. Knowing the relative size of
a document is important when used in conjunction with other metrics
to make an overall assessment of the software process/product. When
used independently, DOCSIZE could indicate when there is
insufficient documentation for a software process/product. Figure
A-6 illustrates how DOCSIZE might be used.

FIGURE A-6: DOCSIZE

The figure displays information about three documents. Using this
figure the tester might conclude that more time should be set aside
for reviewing document one since DOCSIZE is higher for that
document. If the programs are comparable, it may also indicate that
program one is better documented than program three.

Metric: ERROR CLASSIFICATION

Description:

Error Classification is a code that categorizes the type of
software error that was made. The code will be made up of five
major fields. Each code conveys information about the error. The
scheme is as follows:

How did the error manifest itself to the user?

01 abort
02 program produced wrong answer
03 code reading
04 erroneous error message produced
05 using supporting documentation
06 using an analysis tool
07 other


Type of error

01 incorrect logic
02 incorrect equation
03 invalid subscript or counts used
04 data read in incorrectly
05 data written incorrectly
06 variable not initialized properly
07 incorrect variable type
08 module called with incorrect parameters
09 standard violation
10 database
11 documentation
12 generation problem
13 violated system constraints

Found by whom?

01 associated development team
02 associated IV&V team
03 NSWCDD user
04 non-NSWCDD user

How found?

01 operational use
02 product inspection
03 by converting/executing program for/on another platform
04 testing

Test Case Classification (use only if 04 is selected)

01 Operational - test that uses real data and real sequences of
   events
02 Stress - test that simulates a program's full load capability
03 Boundary - test that uses maximum or minimum boundary data
04 Randomized - test that uses randomly chosen real data
05 Robustness - test that uses invalid data or procedures

Error was introduced at what point in life cycle?

01 requirements definition
02 design
03 coding
04 maintenance


Each of these may be reported separately or in combinations. They
can be reported either in tabular or plot form. For example,
plotting "Error was introduced at what point in the life cycle"
against phase of the life cycle would be especially revealing. This
would show where the errors were being introduced so that
appropriate corrective action to the process could be applied.
Also, the information from the various reports can be used in
conjunction with one another. An example would be a high number of
errors that were introduced in the requirements phase and were also
primarily logic type errors. This information might be useful in
determining the type of corrective action to take. Cross
tabulations of the above categories would therefore provide
additional information.

Feedback/analysis:

Error Classification can be used to determine whether the
particular project requires more or fewer people and/or time
resources. It can also be used by the configuration management
board and the Software Process Improvement team to identify areas
where process improvements would be beneficial. This metric can be
used to determine if new initiatives aimed at eliminating certain
errors are successful. For example, this information can be used to
determine what types of diagnostic tools might be helpful and to
evaluate the effectiveness of these tools. Additionally, this
metric can be used to determine what types of testing are yielding
what types of errors.

Figure A-7, a pie chart, is an example of how the distribution of
error types might be reported for Program X. Similar charts can be
constructed for each of the other 5 subgroups (error manifestation,
found by, how found, test case classification, time error
introduced).

Figure A-7: ERROR CLASSIFICATION by TYPE for Program X


Further information can be derived by breaking this information
down by each program. If a particular program was experiencing
difficulty relating to quality or meeting deadlines, analysis of
the above error classification type might help pinpoint the
problem.
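
The cross tabulation of classification fields described in this
section can be sketched with a simple counter. The problem reports
below are hypothetical, invented only to show the mechanics; the
field values reuse category names from the scheme above.

```python
from collections import Counter

# Hypothetical problem reports: (life-cycle phase introduced, type of error).
reports = [
    ("requirements definition", "incorrect logic"),
    ("requirements definition", "incorrect logic"),
    ("coding", "incorrect equation"),
    ("design", "incorrect logic"),
    ("coding", "variable not initialized properly"),
]


def cross_tabulate(records):
    """Count how often each combination of field values occurs."""
    return Counter(records)


table = cross_tabulate(reports)
for (phase, error_type), count in sorted(table.items()):
    print(f"{phase:25} {error_type:35} {count}")
```

A concentration of counts in one cell, such as logic errors
introduced during requirements definition, points to where
corrective action on the process might be directed.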


Metric:  ERROR  SEVERITY 

Description:

Error  Severity  is  a  seven  level  rating  system  for 
evaluating  the  severity  of  software  errors. 

1.  Critical  -  Program  cannot  be  used  until  error  is 
corrected.  Program  aborts  abnormally,  or  produces 
unreliable  results. 

2. Serious - Some necessary portion of program cannot be used
until error is corrected. It is still possible to use other
portions of the program.

3. Non-critical - Program does not perform correctly for
non-typical test cases. Some nonessential feature of the program
cannot be used. These types of errors can be worked around and are
typically placed on a shopping list or corrected as time allows,
rather than immediately. Program is still usable.

4. Standards violation - unjustified violation of standards.
Violations that are intentionally introduced for a reason are not
errors. (Intentional violations must be documented, giving the
reason why they are necessary.) These errors are not visible to the
end user.

5. Clerical - typographical errors, misspellings, grammatical
errors in the code that do not prevent the reader from
understanding what is written. If the user is misled by the error,
then it gets a higher severity rating. Unnecessary code and
variables would be included in this category.

6. Documentation - any error (typographical errors, misspellings,
etc.) that appears in the supporting documentation (Software Design
Document, User's Guide, etc.) for a program. If the user is
seriously misled by the documentation error, then it should get a
higher severity rating.

0. No error - problem reports submitted because of incorrect
input, user misunderstandings, etc.

Note: Errors of severity level 1, 2, or 3 must be submitted one per
problem report. For level 4, 5, or 6 errors, multiple problems may
be submitted as a single problem report.
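
The rating scale and the submission rule in the note above could be
encoded as shown below. This is a sketch: the enumeration names and
the helper functions are illustrative and are not part of any tool
described in this paper.

```python
from enum import IntEnum


class Severity(IntEnum):
    NO_ERROR = 0
    CRITICAL = 1
    SERIOUS = 2
    NON_CRITICAL = 3
    STANDARDS_VIOLATION = 4
    CLERICAL = 5
    DOCUMENTATION = 6


def report_is_valid(severities):
    """Check the rule that a severity 1-3 error gets its own problem report.

    `severities` lists the severity of every error in one report.
    """
    has_major = any(1 <= s <= 3 for s in severities)
    return len(severities) == 1 or not has_major


def count_critical_or_serious(severities):
    """Number of severity 1 or 2 errors, which ideally is zero."""
    return sum(1 for s in severities
               if s in (Severity.CRITICAL, Severity.SERIOUS))


print(report_is_valid([Severity.CLERICAL, Severity.DOCUMENTATION]))
print(report_is_valid([Severity.CRITICAL, Severity.CLERICAL]))
```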

Feedback/analysis:

Error Severity can be used by a manager or group leader when
scheduling tasks. This metric indicates whether a particular
project requires more or fewer people and/or time resources. It can
also be used by the IV&V team for the same reason. The Software
Process Improvement Team and the CM board will use it to identify
areas where process improvements would be beneficial. This might be
used to determine if new initiatives aimed at eliminating certain
errors are successful.


Figure A-8 is an example of a graphical report of this metric for
Program XX. Ideally the graph should have no errors of severity 1
or 2. For Program XX there are a total of 11 such errors.
Management may question the reasons for this count.


FIGURE  A-8:  ERROR  SEVERITY  FOR 
PROGRAM  XX 

Metric:  KSLOC 

Description:

KSLOC is the number of executable lines of code (excluding blanks
and comments) of a program divided by 1000.

Feedback/analysis:

KSLOC can be applied to different programming languages and
compared to outside organizations, especially in comparing
estimates for work that is to be contracted out. KSLOC can be used
in conjunction with NUMMON and NUMPEO to estimate future program
development resource allocation/staffing. It also can be used to
calculate other metrics (e.g., DEFECT DENSITY). Knowing the
relative sizes of programs is important when used in conjunction
with other metrics to make an overall assessment of the software
process/product.
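
A rough line counter shows how KSLOC relates to the total line
count (KTLOC, defined later in this appendix). The sketch makes
simplifying assumptions that a real counter would need to refine: it
treats every non-blank line that does not start with the comment
prefix as executable, and it ignores block comments and trailing
comments. The function name and the sample source are hypothetical.

```python
def line_counts(source, comment_prefix="//"):
    """Return (KSLOC, KTLOC) for a piece of source text.

    Simplification: a line counts as executable if it is neither
    blank nor a whole-line comment starting with `comment_prefix`.
    """
    total = 0
    executable = 0
    for line in source.splitlines():
        total += 1
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            executable += 1
    return executable / 1000, total / 1000


sample = (
    "// add two numbers\n"
    "int add(int a, int b) {\n"
    "\n"
    "    return a + b;\n"
    "}\n"
)
ksloc, ktloc = line_counts(sample)
# Share of lines that are comments or blanks.
non_executable_share = 1 - ksloc / ktloc
print(ksloc, ktloc, round(non_executable_share, 2))
```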




Figure A-9 illustrates how KSLOC might be used to compare
programs. These programs might be different software items or they
might be the same program tracked over different years. The data
can also be used to estimate the size of future efforts. If a new
program is being developed that is roughly comparable to programs
2 and 3, a developer could estimate a KSLOC of about 25 for the new
program based on this data.


Figure A-9: KSLOC Sample Report

Metric: KTLOC

Description:

KTLOC is the total number of all lines of code of a program
divided by 1000.

Feedback/analysis:

KTLOC may be used in connection with KSLOC to determine how well a
program is documented. It may be used in determining future program
development resource allocation and staffing.

The following figure is an example report that could be generated
comparing the KTLOC from three programs. In addition this report
could combine the report of KSLOC to indicate how many
non-executable statements there are. From this comparison one could
get an idea of how much in-line documentation and/or blank lines
were being put into a program and how this number varied across
programs.

Figure A-10: EXAMPLE KTLOC AND KSLOC REPORT COMBINED

Notice how Program 1 has very few in-line comments and/or blank
lines while Program 3 has about 50%. Program 1 may be difficult to
maintain and/or test.

Metric: MODCHG

Description:

Percent of modules within a program that have changed (modified,
added, or deleted) from one baseline/revision to another.

MODCHG = (# modules changed / total # of modules (NUMMOD)) x 100

Feedback/analysis:

MODCHG will be used to evaluate program stability and to estimate
resources needed by knowing how many modules have changed. It will
be used in conjunction with requirements changes to estimate how
many modules in a program will be changing in the future.

Figure A-11 illustrates a graphical report of MODCHG. From this
report the V&V group can determine whether testing should be
performed and, if so, the extent. From the figure, the testing
group might decide to put little or no effort into Program 1, while
Program 2 is a good candidate for a complete V&V effort. Program 3
may need some testing depending upon the nature of the changes.

Figure A-11: Example Report for MODCHG

Metric:  MTTF 

Description:

Mean time to software failure occurrences of a software program.

Feedback/analysis:

MTTF can provide an indication of the quality of the software. It
can thus be used to determine both the testing effort required and
the release time from V&V testing. For example, if the MTTF is low,
additional testing may be required in order to increase it. In
addition, management can use this metric to determine how much
additional testing is required to achieve a desired level for this
metric. This metric can be computed from the Statistical Modeling
and Estimation of Reliability Functions for Software (SMERFS)
developed by one of the authors.

This metric can be compared across program revisions to aid in
determining maintainability issues. Figure A-12 illustrates a plot
of this metric for three software revisions. As can be seen, MTTF
is going down. This indicates the quality appears to be degrading,
requiring additional testing effort and/or possible rewrite of the
software.
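
As a point of reference only, the simplest estimate of MTTF is the
sample mean of the observed times between failures. The sketch
below computes that mean; it is not the SMERFS procedure, which
fits reliability models to the failure data, and the function name
and sample times are hypothetical.

```python
def mean_time_to_failure(failure_times):
    """Sample mean of the gaps between successive failures.

    `failure_times` holds cumulative times (e.g., test hours) at which
    each failure was observed, in ascending order, measured from 0.
    """
    if not failure_times:
        raise ValueError("at least one failure time is required")
    previous = 0.0
    gaps = []
    for t in failure_times:
        gaps.append(t - previous)
        previous = t
    return sum(gaps) / len(gaps)


# Hypothetical failures observed after 10, 30, and 60 hours of testing.
print(mean_time_to_failure([10, 30, 60]))
```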


Figure  A-12:  MTTF 
Metric:  NUMMOD 

Description:

The number of modules within a program. Do not include any
external library routines in your count of modules.

Feedback/analysis:

NUMMOD provides program sizing information. It will be used in
conjunction with KSLOC to estimate complexity for evaluation of
design, development, or V&V resource allocation of a program. In
Figure A-13 one can see that the number of modules steadily grew
through three revisions and then dropped slightly for revision 4.
For the V&V team the amount of testing could reflect this same
behavior, i.e., testing increases through revision 3, then drops
for revision 4.

FIGURE A-13: NUMMOD FOR PROGRAM XX


Metric: NUMMON

Description:

The metric NUMMON is the elapsed time (in calendar months) it
takes to complete a project. The metric will also be used with
NUMPEO to determine AVSTAFF.

Feedback/analysis:

NUMMON would be used by the project leaders to determine how well
actual scheduling meets the original projection. This metric can
also be used as data points to create scheduling and staffing
estimates for future endeavors. Note: calendar months with no
effort are counted in NUMMON.

For the example given in Figure A-14, the program was in the
requirements/design phase until April, and completed testing for
implementation by December. The number of months and staffing
during the design phase was relatively low but increased after
April. Management might question why this occurred, especially in
the later phases of the life cycle.

[Figure A-14: staffing table with columns DATE, NUMPEO, NUMMON,
AVSTAFF, and PHASE; entries not reliably legible in the source]

Metric: NUMPEO

Description:

The metric NUMPEO is the total number of person-months used to
complete a program. Be sure to include manager's time spent
reviewing work on tasks as well as any contractor support given to
tasks. The metric will also be used with NUMMON to determine
AVSTAFF.

Feedback/analysis:

NUMPEO would be used by the project leaders to determine how well
actual staffing meets the original projection. This metric can also
be used as data points to create scheduling and staffing estimates
for future endeavors.

For the example given in Figure A-15, the program was in the
design phase until April, and completed implementation by December,
during which time AVSTAFF continually increased until November. If
this was not expected, the development manager would investigate
what caused this spike in the figure (changing requirements, time
management, planning was too crude, etc.). What should especially
be questioned is the large value for NUMPEO beginning with coding.
Poor scheduling may be indicated.

DATE  NUMPEO  NUMMON  AVSTAFF  PHASE
Jan   2       2       1.0      Req Definition and Analysis
[?]   [?]     2       1.0      Design Development
[?]   3       3       1.0      Code and Unit Test
Oct   6       3       2.0      System Integration and Testing
Nov   6       2       3.0      Verification and Validation
Dec   6       3       2.0      Delivery and Production

([?] marks an entry that is illegible in the source.)

Figure A-15: STAFFING INFORMATION FOR PROGRAM X

Metric: NUMPR

Description:

The number of problem reports (PRs) submitted against a CSCI.

Feedback/analysis:

Analysis of this metric will highlight anomalies that require
further investigation, e.g., an unusually high number of PRs in a
given period of time could indicate requirements changes are still
being made after a version baseline is established. Use this metric
to evaluate program stability; the key is that an unusually large
number of PRs warrants a more detailed look at the software
development process to see if corrective action is justified.

Metric: RELIABILITY

Description:

Probability the software will not fail in a specified time within
a specified environment.

Feedback/analysis:

Reliability can provide an indication of the quality of software.
It can thus be used to determine both the testing effort required
and the release time from V&V testing. If the reliability is low,
additional testing may be required in order to increase it. In
addition, management can use this metric to determine how much
additional testing is required to achieve a desired level for this
metric. This metric can be computed from the Statistical Modeling
and Estimation of Reliability Functions for Software (SMERFS)
developed by one of the authors.

This metric can be compared across program revisions to aid in
determining maintainability issues. Figure A-16 illustrates a plot
of this metric for three revisions.

As can be seen, the Reliability is going down. This indicates that
the quality appears to be degrading, requiring additional testing
effort and/or possible rewrite of the software.
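
To make the definition of reliability concrete, the sketch below
evaluates the probability of failure-free operation over a given
time under one common simplifying assumption: a constant failure
rate, so that R(t) = exp(-t / MTTF). That exponential model is an
assumption introduced here for illustration; the paper does not
state which model is used, and SMERFS supports fitting models to
observed failure data rather than assuming one.

```python
import math


def reliability(t, mttf):
    """Probability of no failure during time t, assuming a constant
    failure rate of 1 / mttf (exponential model, an assumption)."""
    if mttf <= 0:
        raise ValueError("MTTF must be positive")
    if t < 0:
        raise ValueError("time must be non-negative")
    return math.exp(-t / mttf)


# Under this model, a lower MTTF gives lower reliability for the
# same mission time.
print(reliability(10, 100) > reliability(10, 50))
```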


FIGURE  A-16:  RELIABILITY 


Metric:  REQCHG 

Description:

Percent of requirements of a specific CSCI that have changed
(modified, added, or deleted) from one baseline/revision to
another.

Feedback/analysis:

REQCHG will be used to evaluate software stability and to estimate
resources needed by knowing how many of the requirements have
changed. Figure A-17 illustrates how REQCHG might be used.


FIGURE  A-17:  REQCHG 

The figure displays historical information about previous
maintenance efforts for a project. Using this figure the developer
could predict that NUMPEO is roughly equal to REQCHG.

Metric:  REQSIZE 

Description:

REQSIZE: Number of requirements of a CSCI.

Feedback/analysis:

REQSIZE is used to calculate REQCHG. Knowing the number of
requirements of a CSCI is important when used in conjunction with
other metrics to make an overall assessment of the software
process/product. Figure A-18 illustrates how REQSIZE might be used
with NUMPEO.

FIGURE A-18: REQSIZE

The figure displays historical information about previous baseline
efforts for similar pieces of software. Using this figure the
developer could predict an estimated NUMPEO based on REQSIZE.
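
One way to turn such historical data into an estimate is a
least-squares line through past (REQSIZE, NUMPEO) pairs. The sketch
below is only one possible approach; the paper describes reading
the estimate from a figure and does not prescribe a fitting method.
The history values are hypothetical and chosen to lie on a straight
line so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: requirement counts and person-months expended.
history_reqsize = [100, 200, 300, 400]
history_numpeo = [5, 10, 15, 20]

slope, intercept = fit_line(history_reqsize, history_numpeo)
estimate = slope * 250 + intercept   # new CSCI with 250 requirements
print(round(estimate, 2))
```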
Customized Software Evaluation Tools:
Application of an Enabling Technology for Reengineering

Lawrence  Markosian,  Russell  Brand  and  Gordon  Kotik* 


Abstract 

This paper describes a new approach to developing tools for
verifying source code compliance with coding standards. The
approach is based on an enabling technology for software evaluation
and reengineering. The key technical ideas underlying the
technology are to represent source code in the form of abstract
syntax trees in an object-oriented database, and to use a library
of utilities to analyze software represented in this way. This
enabling technology supports rapid implementation of
project-specific coding standards. Coding standards verification
tools implemented this way can be used for evaluating legacy
systems that are being reengineered, as well as for performing
quality assurance on the reengineered systems. A major focus of the
paper is the enabling technology, which is applicable to other
reverse engineering and reengineering tasks. The paper also
discusses several examples of coding standards implemented using
this technology. Finally, we summarize our experience using this
approach.


1  Introduction 

A key component of a software evaluation process is a procedure
for checking source code for conformance with coding standards.
Prior to reengineering a legacy system, this procedure can be used
to help determine whether the system should be reengineered and, if
so, what reengineering approach should be used. For example, the
procedure can help identify modules that must be redesigned or
rewritten by hand. Another goal is to ease maintenance of the
reengineered system by making the code more understandable and by
reducing the likelihood of defect introduction during maintenance.

*Reasoning Systems, Inc., 3260 Hillview Avenue, Palo Alto,
CA 94304

The coding standards may be based on published standards such as
the Software Productivity Consortium's Ada Quality and Style:
Guidelines for Professional Programmers [1] for Ada, or they may be
based on corporate guidelines. Often there are project-specific
guidelines that deliverable code must comply with.

Coding standards typically cover a wide range of coding practices
including commenting style, formatting, identifier naming, language
standard, coding idioms, file organization, and use of libraries.

While the simplest of such coding standards may be checked by
regular expression-based text analysis tools, others require a
deeper model of the implementation language.
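
The limits of text-based checking can be seen in a small Python
sketch. It enforces a hypothetical coding standard, "no goto
statements", by searching C source text with a regular expression.
The rule, the function name, and the sample program are all
invented for illustration.

```python
import re

# Matches the word "goto" anywhere in a line of text.
GOTO = re.compile(r"\bgoto\b")


def find_goto_lines(source):
    """Return the 1-based numbers of lines whose text mentions goto."""
    return [number
            for number, line in enumerate(source.splitlines(), start=1)
            if GOTO.search(line)]


sample = (
    "int f(int x) {\n"
    "    if (x < 0) goto fail;\n"
    "    /* never use goto here */\n"
    "    return x;\n"
    "fail:\n"
    "    return -1;\n"
    "}\n"
)
print(find_goto_lines(sample))
```

The check reports line 3 as well as line 2, although line 3 only
mentions the keyword inside a comment. Telling the two apart
requires knowledge of the language's structure, which is the kind
of deeper model referred to above.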

The great variety of possible coding standards, including
project-specific standards, makes it difficult to find an adequate
commercially available, off-the-shelf (COTS) automation tool. Also,
there is often little support for the old languages and dialects in
which legacy applications were written.

This paper describes an approach to building software evaluation
tools that makes it feasible to implement a wide range of coding
standards, including those that require semantic analysis of the
program being analyzed.

Section 2 describes Software Refinery™ and the Refine Language
Tools™, products from Reasoning Systems that support this
approach.

Section 3 contains examples of specific coding standards
implemented using this approach. We provide examples for C. We give
an informal characterization of each coding standard and show its
implementation in Software Refinery.

Section 4 summarizes industrial experience with software
evaluation tools implemented using this approach.


2  Enabling  technology 

We use a new, enabling technology for reengineering in building
tools for coding standards verification. The central technical
ideas underlying the technology are:

• represent software in the form of abstract syntax trees (ASTs)
in a persistent object-oriented database;

• use libraries to analyze and transform code represented in this
form.

The coding standards verification tools are built using Software
Refinery and the REFINE Language Tools, reengineering products from
Reasoning Systems that incorporate this technology. These products
run on SPARCstation™, IBM RS/6000™ and HP 9000/700™
workstations.

The remainder of this section describes the features of Software
Refinery and Refine/C™, the REFINE Language Tool for analyzing C
programs. This section also discusses the data model that Refine/C
uses for representing C programs.


2.1  Software  Refinery 

Software Refinery is an environment for developing reengineering
tools. We used Software Refinery to build the Refine Language
Tools, including Refine/C. For building the C coding standards
verification tools, we used Software Refinery to customize
Refine/C.

Software Refinery provides:

• a parser and printer generator,

• a persistent object-oriented database,

• a very-high-level language for analyzing and transforming
software in this database,

• a library of utilities for operating on code in the database,

• an X Windows-based graphical user interface toolkit, and

• a programming environment that permits seamless integration with
existing UNIX tools and networking utilities.

Figure 1 shows the major components of Software Refinery. A
complete description of Software Refinery is provided in
[3, 4, 5, 6].

Dialect™ is an LALR(1) parser generator. It provides a mechanism
for handling non-LALR(1) languages such as COBOL.

Parsers  developed  using  Dialect  retain  surface  syntax, 
which  includes  comments  and  formatting  information. 
A  coding  standards  verification  tool  can  make  use  of 
this  information — for  example,  to  determine  whether 
indentation  conventions  are  followed.  Also,  during 
the  code  conversion  process,  code  that  is  transferred 
into  the  reengineered  system  without  modification  can 
be  printed  exactly  as  it  appears  in  the  original  source 
program. 

Software Refinery’s very-high-level language,
Refine™, is the implementation language for our coding
standards verification tools. Refine includes:

•  symbolic  mathematical  operations  including  first- 
order  logic  and  set  operations, 

•  a library of tree operations such as copy, compare,
traverse and substitute,

•  a transformation operator for specifying modifications
to source code, and

•  pattern-matching  against  ASTs  in  the  objectbase. 

The  Refine  compiler  supports  unit-level  incremental 
compilation  and  dynamic  linking,  which  speeds  up  the 
edit-compile-run  loop. 
•  Dialect™: programmable language parser/printer

•  Refine™: software analysis and transformation system


Figure 1: The major components of Software Refinery


2.2  Refine/C

Refine/C is a C reengineering workbench developed by
Reasoning Systems using Software Refinery. Refine/C
supports:

•  modelling C source code as ASTs in the Refine
object base;

•  associating variable definitions with their references;

•  generating graphical reports for each C program,
including:

-  structure chart,

-  data-flow diagram,

-  function control-flow diagram and cyclomatic
complexity report,

-  coding standards report,

-  functions table,

-  variables table, and

-  types table;

•  navigating from any report directly into the corresponding
source code;

•  exporting to a forward-engineering CASE tool;

•  PostScript™ or ASCII printing of reports; and

•  handling large C programs.

Refine/C has a graphical user interface with pull-down
menus, on-line help, multiple windows displaying text,
tables and graphs, and hyperlinking among reports and
source code.

The programmer’s guide for Refine/C documents its
application programmer’s interface (API) [2]. The API
supports customization and extension of the tool’s
capabilities.

The next section gives an overview of the representation
of C code in Refine/C. This data representation is used
by analysis tools in Refine/C such as coding standards
verification.


2.3  Representation of C code in Refine/C

This  section  gives  an  overview  of  how  C  programs  are 
represented  in  Refine/C. 

The AST of a C program is the basic data structure used
during program analysis and transformation steps.
Refine/C provides an object-oriented model for C abstract
syntax. Object classes correspond to non-terminal
nodes in the AST. Attributes (slots) hold subtrees.

Figure 2 shows three views of the Refine/C representation
of the following C if statement:

    if (players[player].dollars == 0)
    { players[player].bet = 0;
      return;
    }

The upper-left window contains the “Pretty Print” view
of the statement, that is, the surface syntax as it appears
in a source code file. The lower-left window shows a diagram
of part of the AST of the same statement. The upper-right
window shows the “Frame Print” view of the statement
as an object in the database. This object is the root of
the AST that Refine/C creates when it parses the C if
statement. The lower-right window is a history window
that assists in navigation.

Refine/C models C if statements with the
if-statement object class. The first line in the
Frame Print window says that the particular statement
being viewed is an instance of this object class.
The if-statement object class has three attributes
that hold subtrees: if-condition, then-part
and else-part. They are shown in the Frame
Print window. Since there are no else actions
in the example if statement, the value of the
else-part attribute is undefined. The value of
the if-condition is an instance of the equality
object class and represents the condition

    players[player].dollars == 0

in the if statement.

Similarly,  the  value  of  the  then-part  is  an  instance 
of  the  block  object  class  that  represents  the  block 


    { players[player].bet = 0;
      return;
    }

in  the  if  statement. 

The parser creates these objects and sets the values of
their attributes when it parses the original if statement.
The top part of the AST for the if statement is shown as
a diagram in the lower-left window (labelled “Abstract
Syntax Tree”) in Figure 2. The full AST for this if
statement is shown in Figure 3.

The  complete  abstract  syntax  for  C  is  provided  in  the 
Refine/C  API. 
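
As an illustration only, the object model described in this section can be mimicked in Python, with one class per non-terminal and one attribute per subtree. This is a sketch and not Refine/C itself: the class and field names are assumptions modelled on the attribute names given in the text, and expression subtrees are simplified to strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Equality:
    """Stand-in for the equality object class; operands simplified to text."""
    arg1: str
    arg2: str


@dataclass
class Block:
    """Stand-in for the block object class; statements simplified to text."""
    statements: List[str] = field(default_factory=list)


@dataclass
class IfStatement:
    """Stand-in for the if-statement object class and its three attributes."""
    if_condition: Equality
    then_part: Block
    else_part: Optional[Block] = None  # None plays the role of "undefined"


# The example if statement from the text, built as an object tree.
stmt = IfStatement(
    if_condition=Equality("players[player].dollars", "0"),
    then_part=Block(["players[player].bet = 0;", "return;"]),
)
```

Because the example has no else actions, `stmt.else_part` stays at its default, mirroring the undefined else-part attribute in the Frame Print view.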


3  Implementation  of  coding  standards 

This  section  describes  several  related  coding  standards 
for  C  code  and  how  they  are  implemented  in  Refine/C. 

The first coding standard we discuss is the following:
“Do not use an LVAL unnecessarily on the left
side of a comparison.” An “LVAL” is a variable
or other construct that can be assigned a value. In
C, these other constructs include array elements (for
example, a[i]), pointer dereferences (*ptr), and
component selectors (EmployRecord.name and
EmployRecord->name). The motivation for this
coding standard is that during maintenance, the
comparison operator could be incorrectly edited to become
an assignment. This editing error would not be detected
by the compiler, would not always generate a runtime
error, and could “silently” cause incorrect answers to
be generated by the program. For example, x != 8
might be edited incorrectly to become x = 8 when x
== 8 was intended. Usually this incorrect edit will not
generate a compiler error. However, if the original
expression had been 8 != x and this had been changed
to 8 = x, a compiler error would be generated.

The condition in the if statement that we examined
above violates this coding standard because it contains
a component selector, which is an LVAL, on the left
side of the equality test.

Figure 2: Three views of the Refine/C representation of an IF statement

Figure 4: Refine/C representation of an equality expression


Figure 4 shows the Refine/C representation of the
equality expression in the if statement. Note
that the value of the arg1 attribute of the equality
is a direct-component-selection.

The following Refine function is a predicate that
returns true if its argument is an LVAL:

    function is-an-lval? (node: c-object) :
             boolean =
         IDENTIFIER-REF(node)
      or SUBSCRIPT-EXP(node)
      or DIRECT-COMPONENT-SELECTION(node)
      or DEREFERENCE(node)

IDENTIFIER-REF, SUBSCRIPT-EXP,
DIRECT-COMPONENT-SELECTION and
DEREFERENCE are names of Refine object classes
in the abstract syntax of C. They are used as predicates
in the definition of is-an-lval?.

Since the value of the arg1 attribute of the equality
is a direct-component-selection,
is-an-lval? returns true for this node.


Here  is  a  rule  that  checks  whether  its  argument,  a  node  in 
a  C  AST,  is  an  equality  that  violates  the  coding  standard. 

    rule check-for-lhs-lval
         (node: c-object)
      <node, 'check-for-lhs-lval>
          ~in *already-checked*
        & EQUALITY(node)
        & is-an-lval?(ARG1(node))
        & ~is-an-lval?(ARG2(node))
      -->
      <node, 'check-for-lhs-lval>
          in *already-checked*
        & <node, 'check-for-lhs-lval>
          in *violations*

This rule has a single argument, a node in an AST. The
rule contains a transform, as indicated by the transform
arrow, -->. On the left hand side of the transform are
the preconditions for the transform to fire. The
preconditions include tests that the node is an instance of the
equality object class, and that the arg1 attribute of
the node is an LVAL while the arg2 attribute is not.


On the right hand side of the transform are postconditions
that will hold in the state that results if the rule
fires. One of the postconditions is that the offending
node and the name of the rule that it violates are placed
in a set of violations.
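
For readers without access to Refine, the predicate and the rule above can be approximated in Python. The following is a minimal sketch under stated assumptions: the node classes are a toy stand-in for the Refine/C abstract syntax (only the classes named in the text, plus a hypothetical `Constant`), and the two Python sets play the role of *already-checked* and *violations*.

```python
from dataclasses import dataclass


class CObject:
    """Base class for nodes of a toy C abstract syntax tree (assumption)."""


@dataclass(frozen=True)
class IdentifierRef(CObject):
    name: str


@dataclass(frozen=True)
class Constant(CObject):  # hypothetical class, not named in the text
    value: int


@dataclass(frozen=True)
class SubscriptExp(CObject):
    array: CObject
    index: CObject


@dataclass(frozen=True)
class DirectComponentSelection(CObject):
    record: CObject
    component: str


@dataclass(frozen=True)
class Dereference(CObject):
    pointer: CObject


@dataclass(frozen=True)
class Equality(CObject):
    arg1: CObject
    arg2: CObject


LVAL_CLASSES = (IdentifierRef, SubscriptExp, DirectComponentSelection, Dereference)


def is_an_lval(node):
    """True if the node belongs to one of the LVAL object classes."""
    return isinstance(node, LVAL_CLASSES)


def check_for_lhs_lval(node, already_checked, violations):
    """Fire the rule: if all preconditions hold, establish the postconditions."""
    key = (node, "check-for-lhs-lval")
    if (key not in already_checked
            and isinstance(node, Equality)
            and is_an_lval(node.arg1)
            and not is_an_lval(node.arg2)):
        already_checked.add(key)
        violations.add(key)
        return True
    return False


# players[player].dollars == 0
cond = Equality(
    DirectComponentSelection(
        SubscriptExp(IdentifierRef("players"), IdentifierRef("player")),
        "dollars"),
    Constant(0),
)
already_checked, violations = set(), set()
fired = check_for_lhs_lval(cond, already_checked, violations)
```

The rule fires once for the example condition; a second application does nothing because the node and rule name are already recorded in `already_checked`.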

The  example  rule  above  can  be  generalized  to  detect 
other  similarly  error-prone  uses  of  LVALs: 

    rule check-binary-expr
         (node: c-object)
      <node, 'check-binary-expr>
          ~in *already-checked*
        & BINARY-EXPRESSION(node)
        & ~LOGICAL-AND(node)
        & ~LOGICAL-OR(node)
        & ARG1(node) = the-arg1
        & ARG2(node) = the-arg2
        & (is-an-lval?(the-arg1)
           or ASSIGNMENT(the-arg1))
        & ~is-an-lval?(the-arg2)
      -->
      <node, 'check-binary-expr>
          in *already-checked*
        & <node, 'check-binary-expr>
          in *violations*

Code that violates a coding standard can, in many cases,
be converted into equivalent code that conforms with
the standard. For example, the following rule corrects
the order of elements tested in equality comparisons:

    rule flip-equality-comparisons
         (node: c-object)
      <node, 'flip-equality-comparisons>
          ~in *already-checked*
        & EQUALITY(node)
        & rhs = ARG1(node)
        & lhs = ARG2(node)
        & is-an-lval?(rhs)
        & ~is-an-lval?(lhs)
      -->
      lhs = ARG1(node)
        & rhs = ARG2(node)
        & <node, 'flip-equality-comparisons>


This rule can be generalized to work with all types of
comparison operations.
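
The effect of such a corrective transform can be sketched in Python as an in-place operand swap. This is an illustration under assumptions, not the Refine implementation: `Lval`, `Const` and `render` are hypothetical helpers, and operands are simplified to their source text.

```python
from dataclasses import dataclass


@dataclass
class Lval:
    text: str  # e.g. a variable, array element, or component selection


@dataclass
class Const:
    text: str


@dataclass
class Equality:
    arg1: object
    arg2: object


def flip_equality_comparison(node):
    """Swap operands when an LVAL is on the left and a non-LVAL on the right."""
    if (isinstance(node, Equality)
            and isinstance(node.arg1, Lval)
            and not isinstance(node.arg2, Lval)):
        node.arg1, node.arg2 = node.arg2, node.arg1
        return True
    return False


def render(node):
    """Print the equality back as C source text."""
    return f"{node.arg1.text} == {node.arg2.text}"


eq = Equality(Lval("players[player].dollars"), Const("0"))
changed = flip_equality_comparison(eq)
flipped_text = render(eq)
```

After the swap the condition reads `0 == players[player].dollars`, and applying the function again leaves it unchanged, since the precondition no longer holds.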

4  Experience  with  the  approach 

Software Refinery users have implemented coding standards
for a number of languages including Ada, C, FORTRAN,
COBOL and the PL language family [7, 8].

As part of a project to automate a quality assurance process
for C developers using Microsoft Windows™, we
recently implemented 12 coding standards in Refine/C.
The rules were based primarily on a client’s corporate
style guideline document. The rules are being used in
a batch processing environment to simplify the client’s
review and acceptance process for code developed by
its vendors. Implementation of these rules took two
days.

We used Refine/Cobol™ to implement coding standards
for analyzing control flow in COBOL programs using
PERFORM statements. Different dialects of COBOL
treat nested PERFORM statements differently. This
leads to a number of porting difficulties. Also, leaving
and reentering PERFORM ranges by way of GOTOs
leads to unpredictable results.
Abstract

The utility of most commercially available reverse
engineering tools is limited by their ability to extract
only information concerning the sequential execution
of a computer program. With the exception of tools
that support Ada and its explicit tasking constructs,
reverse engineering tools fail to capture information
concerning the flow of information between tasks.
One of the primary reasons for this situation is that
reverse engineering tools only extract information that
is explicitly represented in the syntax of the
programming language. Since older programming
languages do not explicitly represent tasking
constructs, reverse engineering tools for these
languages fail to capture information concerning how
these tasks interact. In this paper, we describe an
approach for extracting extra-linguistic information
from the source code. This approach was used to
support the recovery of task flow information from a
command and control system written in CMS-2.

1.0  Introduction 

Commercially available reverse engineering tools
extract certain aspects of the design of a software
system from source code. These tools can provide
maintainers good insight into the structure of a
program, particularly when analysis reports are coupled
with good source code navigational aids. For example,
using one of the family of tools available from
Reasoning Systems (Refine/C, Refine/Ada,
Refine/Fortran, Refine/Cobol), a software maintainer
can interactively navigate through code by selecting
different portions of code to view from a structure
chart. A maintainer may also begin to gain insight
into the potential impact of changes he plans to
introduce into a program by using these tools to
identify areas of the program that may be affected by
his change. In each of these cases, reverse engineering
tools may improve the productivity of a software
maintainer both by providing insight into the structure
of a program and by making relevant portions of a
program readily accessible.


In spite of the potential of these tools, their utility
for many DOD software systems is limited
by their ability to extract only information concerning
the sequential execution of a computer program.
Real-time military systems frequently consist of individual
units of execution (tasks) that can operate concurrently
on different processors or by interleaving their
functioning on the same processor. These concurrent
tasks typically exchange both control information
and data through a variety of mechanisms.
However, with the exception of tools that support Ada
and its explicit tasking constructs, reverse engineering
tools fail to capture information concerning the flow of
information between tasks. As a result, these tools
provide limited support for understanding the structure
of real-time systems.

One of the primary reasons for this situation is that
understanding design constructs relevant to the
execution of concurrent tasks requires more than an
implementation level understanding [1] of the software.
The syntax of programming languages, particularly
older legacy languages such as CMS-2 or Fortran, does
not make constructs such as inter-task communication
and task synchronization explicit. Instead, the inter-task
behavior of a system often depends on the design
of the specific operating system and the way in which
the application code interacts with the operating
system. Since reverse engineering tools only extract
information that is represented explicitly in the syntax
of the programming language, tools for sequential
programming languages can only extract information
concerning the sequential execution of individual tasks.
These tools will fail to capture information concerning
how these tasks interact.

However,  people  can  often  extract  knowledge  about 
how  concurrent  tasks  interact  from  the  source  code  of 
older  systems,  even  though  such  information  is  not 
explicitly  available  in  the  syntax  of  a  programming 
language.  Capturing  this  information  requires 
knowledge  about  the  type  of  processing  model  used  by 
the  system  software  and  how  this  processing  model  has 
been  implemented  in  a  particular  system. 

In addition, it is also necessary to understand the
idiosyncratic techniques used by a system to implement
these constructs. For example, although tasks may not
be explicitly represented through syntactic constructs in
the code, specific recurring patterns of code may be
used to represent a task in a particular application. As
a result, it may still be possible to recognize those
specific portions of code that implement a particular
task. Similarly, the specific actions through which
these tasks communicate with each other may be
implemented through particular types of calls to the
real-time operating system. Interpreting how specific
tasks communicate with each other will depend on
being able to interpret the meaning of these specific
calls.

As can be seen, the ability of a person to manually
extract extra-linguistic information from the source
code of a program depends on his ability to use
knowledge about how specific design constructs are
implemented in the source code. Reverse engineering
tools are not designed to make use of such meta-design
knowledge. However, unless techniques are developed
to make use of meta-design knowledge, reverse
engineering tools will fail to extract more than the
implementation level detail of a program. As long as
tools can only provide limited visibility into the
structure of a program, they will not be able to provide
the insight required to understand the design of a
real-time system.

2.0  Recovering  Task  Flows  for  the 
MCE  System 

Although one of the goals of this work has been to
develop techniques to recover the inter-task behavior of
real-time systems in general, our initial efforts have
centered on recovering this information from one
system in particular, the Modular Control Equipment
(MCE) system. MCE is a command and control
system written in the Navy source language CMS-2.
It also contains a relatively small amount of embedded
assembler language. The assembler code is less than
ten percent of the system and is predominately located
in the real-time operating system (RTOS). The MCE
software runs in a distributed, multiple CPU hardware
environment. The software consists of 14 functional
subprograms that comprise 44 CMS-2 modules. The
software modules are distributed across the different
CPUs. RTOS enables the software on different CPUs
to communicate, sharing both data and control (task
invocation).

Tasks  in  MCE  are  executable  units  within  a  module 
and  are  comprised  of  many  different  procedures.  Tasks 
spawn  a  variety  of  actions  on  themselves  or  other  tasks 
through  procedure  calls  to  RTOS.  These  actions 
include  scheduling  a  task,  terminating  a  task,  or 
removing a previously scheduled task. We have
focused primarily on developing techniques for
determining  which  tasks  schedule  other  tasks  although 
this  approach  can  be  extended  to  recover  information 
regarding  other  types  of  operating  system  calls. 

Determining the flow of tasks within MCE requires
extracting information that is not directly available in
the MCE source code. Extracting the task flows
requires extracting the two primary pieces of
information required to understand any task flow: who
called a task and what task was called. Neither piece of
information is explicitly represented in the source code.
The following two sections will describe the overall
strategy that was required to automatically extract this
information from MCE.

2.1  Determining  Tasks  Called  by  RTOS 

The task scheduled by an RTOS call is uniquely
determined by a set of arguments passed to RTOS by
the RTOS call. These arguments identify a module and
the task contained in that module. A module/task pair
uniquely identifies a task in the MCE system.

In order to determine the task spawned by an RTOS
call, it is necessary to determine the state of the two
variables that uniquely identify this task at the
particular point in the program when an RTOS call is
made. In some cases determining the value of these
variables is relatively straightforward since these values
are set once and then remain constant throughout the
execution of that module. Other variables, however,
are set multiple times within a module. In these cases
it is necessary to statically evaluate a portion of the
program that determines the state of the variables.

When  a  variable  is  not  preset,  its  state  can  be 
determined  by  identifying  and  evaluating  the  set  of 
statements  that  may  impact  the  value  of  that  variable. 
Algorithms  for  identifying  the  minimal  set  of 
statements  that  may  impact  the  state  of  a  variable  are 
known  as  program  slicing  [3].  We  implemented  a 
program  slicing  algorithm  to  use  in  identifying  the 
minimal  set  of  statements  impacting  the  module  and 
task  variables  within  an  RTOS  call.  For  our  purposes, 
this  technique  assumed  that  the  state  of  the  module  and 
task  variables  was  completely  determined  within  the 
scope  of  a  task  since  the  program  slicing  algorithm 
does  not  trace  data  dependencies  across  task  boundaries. 
This  assumption  was  valid  for  all  but  one  module  in 
the  MCE  system. 

For any particular RTOS call, module and task
variables may assume different values in different
contexts. Because a program slice contains the set of
all statements that may influence the state of a
variable, only a subset of these statements may
actually be executed under a particular context. In order
to evaluate each program slice under each possible
context, each syntactically possible execution thread
within a module that may reach a designated RTOS call
is evaluated. Each of these evaluations derives a
distinct value for the module and task variables for the
particular RTOS call. These values identify the
maximal set of tasks that may be called by a specific
source code RTOS call.
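
The evaluation of a slice along every syntactically possible execution thread can be sketched as follows. This is an assumption-laden illustration in Python, not the tool described here: the slice is represented as a list of hypothetical `("assign", variable, value)` and `("if", then_branch, else_branch)` tuples, branch conditions are ignored (both arms are always explored), and only constant assignments are evaluated.

```python
def execution_threads(stmts):
    """Yield every syntactically possible sequence of assignments through stmts."""
    if not stmts:
        yield []
        return
    first, rest = stmts[0], stmts[1:]
    if first[0] == "assign":
        for tail in execution_threads(rest):
            yield [first] + tail
    else:  # ("if", then_branch, else_branch): explore both arms
        for branch in (first[1], first[2]):
            for head in execution_threads(branch):
                for tail in execution_threads(rest):
                    yield head + tail


def possible_targets(stmts):
    """Evaluate each thread and collect the (module, task) pairs it produces."""
    targets = set()
    for thread in execution_threads(stmts):
        env = {}
        for _, var, value in thread:
            env[var] = value
        targets.add((env.get("module"), env.get("task")))
    return targets


# Slice of statements that may reach a scheduling call (toy example).
slice_before_call = [
    ("assign", "module", 3),
    ("assign", "task", 1),
    ("if", [("assign", "task", 2)], []),
]
targets = possible_targets(slice_before_call)
```

For the toy slice, the two threads yield the module/task pairs (3, 1) and (3, 2), which together form the maximal set of tasks the call may schedule.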

2.2  Determining the Calling Task

RTOS calls are made within the context of a
particular task. A task is said to spawn some action on
another task when an RTOS call is made within the
context of that task. One of the difficulties in
determining task flows is in determining which task
spawned a particular action. This is because there
exists no syntactic structure, such as a procedure, that
corresponds to a task in CMS-2. Therefore, one cannot
simply read the source code to determine the task
containing a particular RTOS call.

Although no syntactic structure exists in CMS-2
that corresponds to a task, it is possible to determine
which task spawns another through a call to RTOS by
identifying the calling context for that call. Because
this calling context is associated with a task, once the
calling context is identified, the task that spawned this
call can be identified. To do so, we needed to define a
set of recognition rules that could be used to identify
occurrences of MCE tasks. These recognition rules
were based on our understanding of how tasks were
implemented in MCE.

Tasks are activated in MCE when an "entry-procedure"
for a module is called by RTOS. This
entry-procedure is implemented by a CMS-2 construct
known as a p-switch, which will pass control to one of
a set of procedures depending on the value of the
argument passed by RTOS to the p-switch. Each
procedure to which an entry procedure can pass control
represents the root procedure of a different MCE task.
A task continues executing in MCE until the root
procedure terminates.
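
By way of analogy only, the dispatch behavior described above resembles indexing into a table of procedures. The Python sketch below is an assumption about the general shape of such a construct, not CMS-2 syntax or semantics; all names are hypothetical.

```python
def task_a_root():
    """Root procedure of a hypothetical task A."""
    return "task A ran"


def task_b_root():
    """Root procedure of a hypothetical task B."""
    return "task B ran"


# Hypothetical p-switch table: the value passed by the operating system
# selects the root procedure that receives control.
P_SWITCH = [task_a_root, task_b_root]


def module_entry_procedure(task_index):
    """Entry procedure for a module: pass control to the selected root procedure."""
    return P_SWITCH[task_index]()


def root_procedures(p_switch):
    """Recover the names of the task root procedures by reading the table."""
    return [proc.__name__ for proc in p_switch]


result = module_entry_procedure(1)
roots = root_procedures(P_SWITCH)
```

Reading the table, rather than executing it, is what lets a static analysis enumerate the root procedure of each task in a module.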

To recognize the occurrences of tasks, we needed
first to identify objects in the code that represented
"entry-procedures" for modules. Once these entry-procedures
were recognized, the root procedures for each
task could be identified by tracing through the
p-switch. In order to recognize these entry-procedures we
used information extracted from external
documentation. Since this documentation was
available in a structured format, we wrote a parser to
extract the relevant information from the
documentation. We used this information to select the
p-switch in a file that functioned as the entry-procedure
for a module. Once the entry-procedure for a module
was identified we could identify the root procedure for
each task contained in that module.

The identification of the root-procedure for each task
provided the knowledge necessary for identifying the
context of an RTOS call. As noted in Section 2.1, a
specific RTOS call may be made within different
contexts, resulting in different values for the module
and task variables and hence spawning different tasks.
Each calling environment contains a root procedure that
corresponds to the calling task. Therefore, determining
the task that spawned a new task requires determining
the calling environment for a particular RTOS call
passed a specific set of module/task values. This was
done as part of evaluating each execution thread
through a program slice.

2.3  Recognizing Design Constructs

The  overall  strategy  for  determining  task  flows 
required  the  implementation  of  a  set  of  recognition 
rules  that  identified  a  small  set  of  design  constructs 
(e.g.,  tasks  and  modules)  in  the  MCE  code.  This 
information  was  then  supplemented  with  techniques  for 
evaluating  the  states  of  specific  variables  in  the  code 
that  identified  the  tasks  spawned  by  a  particular  RTOS 
call. These evaluations required determining the
program  slice  for  the  module  and  task  variable  in  each 
RTOS  call.  This  program  slice  was  evaluated  within 
the  calling  environment  of  each  task  within  the  module 
containing  the  RTOS  call.  This  evaluation  returned  a 
value  for  the  module  and  task  variables  together  with 
the  calling  environment  in  which  these  values  were 
computed.  These  values  identified  the  task  spawned  by 
a  particular  RTOS  call  and  the  calling  environment 
identified  the  task  spawning  the  new  task.  In  this 
section,  we  will  describe  the  approach  we  implemented 
for  recognizing  design  constructs  in  MCE  source  code. 
In  the  following  sections,  we  will  describe  how 
program  slicing  was  implemented  and  how  a  program 
slice  was  evaluated  to  determine  the  task  flow. 

The  purpose  of  design  construct  recognition  is  to 
identify  instances  of  the  design  constructs  implemented 
in  a  software  system  and  their  interrelationships.  We 
created  a  domain  model  that  identifies  both  a  small  set 
of  design  constructs  in  the  MCE  code  (e.g.,  tasks  and 
modules)  and  a  small  set  of  events  through  which  tasks 
interact  with  each  other  (e.g.,  task  spawning).  The 
current  implementation  hard  codes  recognition  rules  for 
these  design  constructs.  Each  recognition  rule  creates 
an  instance  of  an  abstract  design  construct  or 
determines  the  value  of  one  of  its  attributes. 

Because of the difficulty of recognizing these
abstract design constructs from information contained
solely in the source code, we implemented recognition
rules that operated on both design documentation and
the parsed representation of the source code. We were
able to identify a portion of the on-line documentation
for MCE which described each of the 44 modules of the
system. For each module, the module’s name and a
list of tasks were given. A list of files relevant to the
module and the file that contained the entry procedure
for the module were also identified.

Although  this  documentation  was  written  in 
English,  it  was  fairly  structured.  Thus,  with  a 
minimal  amount  of  editing,  we  were  able  to 
automatically  parse  the  documentation  using  a 
recursive-descent  parser  written  in  Refine.  The  parser 
automatically  created  module  and  task  objects  for  each 
module  and  task  identified  in  the  documentation.  The 
module  and  task  names,  and  the  list  of  relevant  files  for 
each  module,  were  also  set  automatically  during 
parsing. 

After obtaining as much information about modules
and tasks as possible from the documentation, we
turned to the source code to complete the model. As
noted in Section 2.1, each module is associated with an
entry procedure. Because the documentation only
identifies the name of the file containing a module
entry procedure, we needed to find this procedure from
the source code. This is done by generating the
procedure calling hierarchy. The module entry
procedure is equivalent to the root procedure in the
procedure calling hierarchy. To avoid orphan
procedures, the root of the largest disconnected
subgraph is used. As stated in Section 2.2, the module
entry procedure contains a CMS-2 construct called a
p-switch. The p-switch passes control to the entry
procedure for a particular task depending upon the value
of the p-switch variable. Therefore, from the p-switch
we were able to determine the names of the task entry
procedures for each of the tasks in that module.
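
One way to picture the search for the module entry procedure is the following Python sketch. It rests on assumptions not stated in the text: the calling hierarchy is given as a hypothetical dictionary from caller to callees, each subgraph has a root that no other procedure calls, and "largest" is measured by the number of procedures reachable from the root.

```python
def reachable(call_graph, start):
    """Return the set of procedures reachable from start, including start."""
    seen, stack = set(), [start]
    while stack:
        proc = stack.pop()
        if proc in seen:
            continue
        seen.add(proc)
        stack.extend(call_graph.get(proc, ()))
    return seen


def find_entry_procedure(call_graph):
    """Pick the uncalled procedure that roots the largest calling subgraph."""
    procedures = set(call_graph)
    called = set()
    for callees in call_graph.values():
        procedures.update(callees)
        called.update(callees)
    roots = procedures - called  # procedures that nobody calls
    # sorted() makes the choice deterministic when two roots tie.
    return max(sorted(roots), key=lambda r: len(reachable(call_graph, r)))


# Toy calling hierarchy for one file; ORPHAN roots a small unused subgraph.
calls = {
    "MODENTRY": ["TASK1", "TASK2"],
    "TASK1": ["HELPER"],
    "TASK2": ["HELPER"],
    "ORPHAN": ["UNUSED"],
}
entry = find_entry_procedure(calls)
```

In the toy hierarchy, `MODENTRY` reaches four procedures and `ORPHAN` only two, so `MODENTRY` is selected. The sketch raises `ValueError` if every procedure is called by another (for example, in a fully cyclic graph).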

Once  modules,  tasks,  and  their  entry  procedures 
have  been  recognized,  it  is  possible  to  determine  the 
behavior  of  each  task  by  identifying  and  interpreting 
RTOS  system  calls  used  by  a  task.  Our  domain  model 
represents  each  of  the  events  produced  via  an  RTOS 
call  and  its  associated  attributes.  We  implemented 
event  recognition  algorithms  that  identify  occurrences 
of  these  events. 

The  first  step  is  to  find  all  of  the  RTOS  calls  in  the 
source  code.  This  is  easy  to  do  by  traversing  the 
abstract  syntax  tree  and  testing  for  the  name  RTOS  in 
each  procedure  call  object  encountered.  The  next  step 
is  to  evaluate  the  value  of  the  arguments  used  by 
RTOS  to  determine  the  task  behavior  the  RTOS  call 
represents.  An  RTOS  call  has  two  arguments,  the  type 
of the RTOS call and a table (a CMS-2 data structure)
containing information for that type of call. The fields
in  the  table  vary  depending  on  the  type  of  RTOS  call 
invoked.  For  example,  if  the  RTOS  call  schedules  a 
task,  then  the  table  includes  two  fields  which  contain 
the  information  necessary  for  the  operating  system  to 
determine  which  task  to  schedule.  For  each  RTOS  call 
identified  in  the  code,  the  first  argument  identifying  the 
type  of  RTOS  call  is  accessed  and  the  appropriate  event 
object  is  created  to  represent  the  event.  Our  algorithm 
then  determines  the  task  invoking  this  event  and  the 
values  of  designated  fields  in  the  table  to  determine  the 
value  of  the  event’s  attributes.  This  is  done  by 
computing and evaluating a program slice for the
relevant fields in the table.
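
The first recognition step, finding every RTOS call and creating an event from its first argument, might look like the sketch below. It is illustrative only: the AstNode layout, the RtosEvent record, and the integer encoding of the call type are assumptions made for this example and are not taken from the actual tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { NODE_CALL, NODE_OTHER } NodeKind;

/* Hypothetical abstract syntax tree node. */
typedef struct AstNode {
    NodeKind        kind;
    const char     *callee;       /* procedure name when kind == NODE_CALL */
    int             call_type;    /* first argument of an RTOS call        */
    struct AstNode *first_child;
    struct AstNode *next_sibling;
} AstNode;

/* Hypothetical event record: the type of RTOS call and where it occurs. */
typedef struct {
    int            call_type;
    const AstNode *site;
} RtosEvent;

/* Pre-order walk over node and its siblings. Each procedure call named
 * "RTOS" is recorded in events[]. `found` is the number of events stored
 * so far; the updated count is returned. */
size_t collect_rtos_events(const AstNode *node, RtosEvent *events,
                           size_t capacity, size_t found)
{
    for (; node != NULL; node = node->next_sibling) {
        if (node->kind == NODE_CALL && node->callee != NULL &&
            strcmp(node->callee, "RTOS") == 0) {
            assert(found < capacity);
            events[found].call_type = node->call_type;
            events[found].site = node;
            found++;
        }
        found = collect_rtos_events(node->first_child, events,
                                    capacity, found);
    }
    return found;
}
```

The attribute values of each event would then be filled in by slicing on the table fields, which this sketch does not attempt.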

2.4  Implementing  Program  Slicing 

A  program  slice  on  some  variable  v,  or  set  of 
variables,  at  statement  n  consists  of  those  statements 
that  contribute  to  the  value  of  v  just  before  statement  n 
is  executed.  In  the  current  implementation,  we 
compute  a  program  slice  from  a  data  flow  graph. 

A  data  flow  graph  is  constructed  by  identifying  a  set 
of  "reaching  definitions"  for  each  variable  used  in  a 
program. Statement m is a reaching definition for
variable v used by statement n when statement m
defines the value of v actually used at n through some
execution path. Note that a variable v in statement n
may have several reaching definitions under different
execution paths. When m is a reaching definition for n,
n "backward depends" on m, and m "forward depends" on n.
A backward (forward) program slice is computed on
statement n by taking the transitive closure of all
backward-depends (forward-depends) relations on statement n.
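
A backward slice computed as a transitive closure can be sketched as follows. This is a minimal sketch under stated assumptions: the DataFlowGraph type, the MAX_STMTS bound, and the function names are hypothetical, and statements are identified by small integer indices.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_STMTS 32   /* hypothetical bound on statements in the graph */

/* depends[n][m] is true when statement n "backward depends" on m,
 * i.e. m is a reaching definition for a variable used at n. */
typedef struct {
    int  count;
    bool depends[MAX_STMTS][MAX_STMTS];
} DataFlowGraph;

void add_dependence(DataFlowGraph *g, int user, int definer)
{
    assert(user >= 0 && user < g->count);
    assert(definer >= 0 && definer < g->count);
    g->depends[user][definer] = true;
}

/* Set in_slice[s] for every statement s reachable from start along
 * backward-depends edges (a depth-first transitive closure). */
void backward_slice(const DataFlowGraph *g, int start,
                    bool in_slice[MAX_STMTS])
{
    int stack[MAX_STMTS];   /* each statement is pushed at most once */
    int top = 0;

    assert(start >= 0 && start < g->count);
    memset(in_slice, 0, MAX_STMTS * sizeof(bool));
    in_slice[start] = true;
    stack[top++] = start;

    while (top > 0) {
        int n = stack[--top];
        for (int m = 0; m < g->count; m++) {
            if (g->depends[n][m] && !in_slice[m]) {
                in_slice[m] = true;
                stack[top++] = m;
            }
        }
    }
}
```

A forward slice follows from the same routine applied to the transposed relation.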

2.4.1 Intra-procedural data dependence
analysis: The first step needed to generate an
intra-procedural data flow graph is to generate a control flow
graph (CFG). A control flow graph for a procedure is a
directed graph that contains an initial node, which
represents the entry point for the procedure, a final
node, which represents the procedure's exit point, and a
set of remaining nodes that each represent sequences of
simple statements in the procedure represented by the
CFG. Each edge in the graph represents a possible
flow of control.

The  next  step  in  data  dependence  analysis  is  to 
identify  the  reaching  definitions  for  each  location  used 
in  a  procedure.  The  term  location  is  used  instead  of 
variable  because  it  is  necessary  to  keep  track  of  arrays 
and data structures. Each node in a CFG is mapped to
a set of locations defined and a set of locations used in
the statement represented by a node. There exists a
reaching definition between a definition and a use of a
location if there is a path in the CFG between the node
that contains the definition of the location and the node
that contains the use of the location. Since a location
may  be  defined  multiple  times  within  a  procedure,  there 
are  many  potential  candidates  for  the  definition  that 
actually  reaches  a  use  of  a  location  at  a  statement.  It  is 
possible  for  a  location  to  have  several  reaching 
definitions  because  the  definitions  for  that  location  are 
in  the  body  of  conditionals.  However,  one  definition 
can  also  cancel  another,  eliminating  the  canceled 
definition  as  a  reaching  definition  for  all  subsequent 
uses  of  that  location. 
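
The way one definition cancels another can be illustrated with the standard iterative gen/kill formulation of reaching definitions. This is a textbook technique offered as a sketch; the paper does not state which algorithm the tool uses. The Cfg type, the bit-set encoding of definitions, and the MAX_NODES bound are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16   /* hypothetical bound on CFG nodes */

typedef unsigned int DefSet;   /* bit d set: definition d is in the set */

typedef struct {
    int    node_count;
    bool   edge[MAX_NODES][MAX_NODES]; /* edge[a][b]: control may flow a to b */
    DefSet gen[MAX_NODES];   /* definitions made in the node                   */
    DefSet kill[MAX_NODES];  /* other definitions of the same locations        */
    DefSet in[MAX_NODES];    /* definitions reaching the node's entry          */
    DefSet out[MAX_NODES];   /* definitions still valid at the node's exit     */
} Cfg;

/* Iterate to a fixed point:
 *   in[n]  = union of out[p] over all predecessors p of n
 *   out[n] = gen[n] | (in[n] & ~kill[n])                     */
void compute_reaching_definitions(Cfg *cfg)
{
    assert(cfg->node_count >= 0 && cfg->node_count <= MAX_NODES);
    for (int n = 0; n < cfg->node_count; n++) {
        cfg->in[n] = 0;
        cfg->out[n] = cfg->gen[n];
    }

    bool changed = true;
    while (changed) {
        changed = false;
        for (int n = 0; n < cfg->node_count; n++) {
            DefSet in = 0;
            for (int p = 0; p < cfg->node_count; p++) {
                if (cfg->edge[p][n])
                    in |= cfg->out[p];
            }
            DefSet out = cfg->gen[n] | (in & ~cfg->kill[n]);
            if (in != cfg->in[n] || out != cfg->out[n]) {
                cfg->in[n] = in;
                cfg->out[n] = out;
                changed = true;
            }
        }
    }
}
```

For a diamond-shaped CFG in which node 0 defines a location and nodes 1 and 2 (the two branches of a conditional) each redefine it, the use in node 3 is reached by the definitions from nodes 1 and 2 only; the definition in node 0 is canceled on both paths.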

2.4.2  Inter-procedural  data  dependence 
analysis:  We  extended  the  concept  of  reaching 
definitions  to  take  into  account  reaching  definitions 
between  statements  contained  in  different  procedures. 
Our  extensions  only  consider  reaching  definitions 
contained  within  the  scope  of  a  single  task.  Reaching 
definitions  that  occur  between  tasks  are  not  considered 
by  our  algorithm.  Inter-procedural  data  flow  analysis 
considers  both  global  variables  and  parameter  passing 
between  procedures. 

In  order  to  support  inter-procedural  data  flow 
analysis,  the  process  is  done  in  several  steps.  First, 
the  control  flow  graph  is  generated.  Second,  within 
each  procedure  the  definitions  and  uses  of  a  location  are 
computed.  Third,  reaching  definitions  are  computed  for 
all  locations  used  in  a  procedure.  Finally,  the  relations 
forward-depends  and  backwards-depends  are  computed. 
During  this  process  the  reaching  definitions  for  global 
variables  are  found.  Each  step  is  done  for  all 
procedures,  via  a  post-order  traversal  of  the  procedure 
calling  hierarchy. 

The inter-procedural reaching definitions for a global
variable use can be found in one of three places: within
the procedure (an intra-procedural reaching definition),
in a procedure called by the procedure, or in a procedure
which calls the procedure. Each of these is considered
in order. First, the reaching definitions within the
procedure  are  considered.  If  the  reaching  definition  is  a 
regular  assignment  statement,  that  statement  is 
returned.  Second,  a  procedure  call  contains  the  reaching 
definition,  if  the  global  variable  was  defined  within  that 
called  procedure.  The  called  procedure  must  be 
investigated  to  find  the  assignment  statement  which  is 
the  actual  reaching  definition.  The  intra-procedural 
information  for  all  global  variables  defined  within  a 
procedure  is  summarized  in  the  unique  exit  node  of  the 
CFG,  so  it  is  easy  to  access.  Finally,  if  no  definitions 
are  found  within  the  procedure  or  a  called  procedure, 
then  all  procedure  calls  related  to  the  procedure  must  be 
investigated.  This  process  is  a  recursive  one, 
traversing  the  calling  hierarchy  as  needed. 


If  the  variable  is  a  formal  parameter,  and  is  not 
defined  within  the  procedure  then  the  reaching 
definition  must  be  the  one  implicit  in  parameter 
passing.  Therefore  all  of  the  calls  to  the  procedure  are 
the  reaching  definitions.  This  will  work  for  call  by 
value  parameter  passing.  The  issue  of  aliases  (call  by 
reference)  or  other  parameter  passing  schemes  has  not 
been  investigated. 

An example of a program slice is shown below,
with emphasis on the inter-procedural data flow.

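int some_global_variable;

void t(int x);

int p()
{
    int i = 0, z, x = 1, y = 2;
    z = x * y;      /* overwritten below, so not in the slice on z */
    if (i == 0)
    {
        i = 5;
    }
    else
    {
        i = 6;
    }
    t(0);           /* defines some_global_variable */
    z = i + some_global_variable;
    return z;
}

void t(int x)
{
    some_global_variable = 1;
}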


The program slice on z at return z is:

p: i = 5
p: i = 6
p: t(0)
p: z = i + some_global_variable
p: return z

t: some_global_variable = 1

In this program the value of z at return z is
computed using the previous statement; therefore the
value of z depends upon some_global_variable and i.
The value of i depends only on statements in procedure
p. i is set conditionally, so both assignments appear
in the slice. If conditionals were included in the slice,
i == 0 would also appear. Note that even though it is
easy to determine that i does equal 0, and therefore i = 5
is executed and not i = 6, both still appear in the slice
because there is no way, in general, to determine
statically what will happen when the program is
executed. The value of some_global_variable is set
in procedure t, which is called by p. Therefore the call
to t and the assignment are included in the slice by
using inter-procedural data flow analysis.

2.5 Evaluating Execution Threads Through a
Program Slice

Once  a  program  slice  is  available  it  is  possible  to 
evaluate  the  slice  to  determine  the  possible  values  of 
the  table  fields  used  by  an  RTOS  system  call. 
Evaluation of a slice is made somewhat easier in CMS-2
because procedure invocation does not introduce a
new scope. All variables in a CMS-2 program,
including formal arguments of procedures, are global.

Inputs given to the evaluator are the variables of
interest,  a  list  of  the  variables  for  which  values  are 
requested,  and  the  statement  of  interest  (i.e.,  the 
statement  in  the  slice  for  which  the  variable  values 
should  be  evaluated).  The  execution  of  a  slice  is  guided 
by  a  pre-order  traversal  of  the  procedure  calling 
hierarchy.  As  it  is  traversed,  each  procedure  that  is 
encountered  may  contain  some  statements  that  are 
found  in  the  slice.  They  are  evaluated  in  the  order  in 
which  they  occur  in  the  procedure  (i.e.,  statements  are 
sorted by line number) and their values are saved for
use  in  other  computations  of  the  evaluator.  When  the 
statement  of  interest  (the  RTOS  call  in  this  case)  is 
encountered,  the  values  of  the  variables  of  interest  are 
checkpointed.  If  the  statement  is  encountered  again, 
the values at that time are also checkpointed. Sets of
checkpointed  values  together  with  the  calling 
environment  for  each  set  are  returned  from  the 
evaluator. 

Given the values for the module/task pair and the
calling environment, it is possible to compute the
calling tasks and the called tasks of the RTOS call. A
calling task is one whose entry procedure is contained
in the calling environment. A called task is the one
that corresponds to the module/task pair. If the module
is  the  same  as  the  one  under  investigation,  the  task 
number  is  an  index  into  the  module’s  entry  procedure 
p-switch  statement,  which  can  be  thought  of  as  a  list 
of  all  intra-module  task  entry  procedures.  If  the  module 
number  corresponds  to  another  module,  then 
information  from  the  documentation  is  used  to 
determine  the  name  of  the  task  so  that  it  can  be 
displayed  in  the  task  flow  graph  or  table. 
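
Resolving the called task from a module/task pair can be sketched as a table lookup. This is illustrative only: the Module type and its field names are hypothetical, and treating the task number as a zero-based index into the p-switch is an assumption, since the source does not state the indexing convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of a module: its number and the task entry
 * procedures listed in its p-switch, in p-switch order. */
typedef struct {
    int          module_number;
    const char **task_entries;
    size_t       task_count;
} Module;

/* Return the entry procedure name for the (module_number, task_number)
 * pair when it refers to the module under investigation. NULL means the
 * task belongs to another module and its name must be taken from the
 * documentation instead. Zero-based indexing is assumed. */
const char *called_task_entry(const Module *current,
                              int module_number, int task_number)
{
    if (module_number != current->module_number)
        return NULL;
    assert(task_number >= 0 && (size_t)task_number < current->task_count);
    return current->task_entries[task_number];
}
```

Each non-NULL result corresponds to one arc in the module's task flow graph, from the calling task to the named entry procedure.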


2.6  Results 

We  have  evaluated  task  flows  for  29  of  the  44  MCE 
modules  (the  others  contain  classified  information). 
The  results  so  far  are  very  encouraging.  The  capability 
described  above  has  produced  the  completely  correct 
graph for 8 modules. For 14 modules we generated a
graph that is slightly different (~90% similarity) from
the graphs produced by hand. Results are not currently
available  for  the  other  7  modules. 

There are several factors to account for the
discrepancies in these results:

1. Bugs in the program slicer or evaluator.
The program slicer and evaluator are under
development and could still contain bugs.

2. Limitations of the program slicer or
evaluator. The program slicer and evaluator
do not handle all conditions. For example,
the program slicer does not include conditional
statements in a slice. When the slice is
evaluated without these conditionals, this
could cause certain possible variable values to
be eliminated, which could easily explain
missing arcs in the graph. Another limitation
is that the program slicer currently does not
handle recursive procedures. This makes it
impossible to generate graphs for two
modules.

3.  Limitations  of  the  tasking  graph  generator. 
In  at  least  one  module  there  exists  an  RTOS 
call  whose  table  argument  is  a  variable,  not  a 
pointer  to  a  table.  To  handle  this  discrepancy 
a  second  program  slice  must  be  generated, 
which  currently  is  not  done.  Also,  as  stated 
before,  this  technique  assumed  that  the  state  of 
a  module  and  task  variable  were  completely 
determined  within  the  scope  of  a  task.  This  is 
not  the  case  in  at  least  one  module.  Both  of 
these  limitations  cause  missing  arcs  in  the 
module's  graph. 

4.  Modules  are  too  large.  Currently,  we  do 
not  have  enough  computer  resources  to 
generate  the  data  flow  information  for  four  of 
the  modules.  For  example,  one  of  these 
modules  has  a  parse  tree  on  the  order  of  100 
megabytes  of  memory,  before  data  flow 
information  is  generated. 

5. Incorrect hand generated tasking graphs.
For at least two modules the graphs produced
automatically were correct and the hand
generated ones were missing arcs.

We  are  investigating  all  of  these  avenues  in  order  to 
improve  our  results.  The  program  slicer  is  currently 
being  enhanced  to  handle  recursive  procedures.  With 
respect  to  assignments  in  conditionals,  we  are 
investigating  a  measure  to  recognize  when  this  is 
occurring.  Eventually  both  the  program  slicer  and  the 
evaluator  must  be  enhanced  to  handle  conditional  code. 
Additionally,  the  tasking  graph  generator  must  be 
extended  to  handle  table  specifications  in  RTOS  calls 
via  variables.  To  handle  the  modules  that  are  too  large 
we  are  adding  more  memory  for  our  machines.  The 
main  area  of  focus,  however,  will  be  to  make  the  data 
flow  analyzer  more  efficient  in  its  use  of  resources. 
We  also  need  to  investigate  more  of  the  graphs  to  see  if 
the hand generated graphs are indeed correct. Lastly, we
continue to find and fix bugs.

There  are  several  factors  that  can  be  used  to  measure 
the  success  of  this  effort.  First,  automating  the 
process  to  generate  task  flow  graphs  reduces  the  level 
of  effort  required  to  generate  these  graphs.  The  process 
described  above  will  take  less  than  an  hour  to  produce  a 
graph  once  the  data  flow  analysis  has  been  done.  The 
time required to produce the same graph by hand is on
the order of several days to weeks. The reduced level of
effort to generate task flow graphs will be particularly
useful during maintenance of the MCE system. New
versions of the source code are being released on a
regular basis. When the new version of the source code
becomes available it could be a matter of several days
to automatically regenerate all of the graphs for each
module. If these graphs needed to be generated by hand
it  could  require  significantly  more  time  for  a  maintainer 
to  notice  how  changes  between  versions  impact  the 
task  flow. 

Second,  the  discrepancies  between  automatically 
generated  documentation  and  the  manually  generated 
documents  point  to  possible  errors  in  the  manually 
generated  reports.  As  a  result,  automatically  generating 
documentation  will  ensure  that  the  documentation  that 
is  available  to  the  maintainer  is  more  reliable  than  may 
otherwise  be  possible. 

Finally,  this  process  generates  on-line 
documentation  for  maintainers  which  is  integrated  with 
the  source  code  listing.  As  a  result,  it  can  be  used  for 
source  code  navigation  and  can  be  integrated  into  other 
reports  via  hyperlinks. 

3.0  Conclusions 

The real-time tasking tool implemented for MCE
demonstrates that a significant level of design recovery
can be obtained when a small amount of design
knowledge  regarding  a  system  is  encoded  into  a 
powerful  set  of  tools  and  then  applied  in  an  analysis 
across the entire system. The current implementation
hard-codes recognition rules for a small set of
MCE-relevant objects (e.g., tasks and modules) and a small
set of MCE-relevant events (e.g., task spawning via
RTOS calls). The notion of "objects" of interest in a
program  and  "events"  of  interest  that  relate  objects  to 
each  other  [2]  is  a  generic  way  to  view  the  design  of  a 
software  system.  In  the  future,  we  intend  to  build  a 
framework  that  supports  the  recognition  of  objects  and 
events in an attempt to capture what can be termed the
architecture  of  a  software  system.  This  recognition 
framework  will  support  the  specification  of  recognition 
rules  for  object  and  event  types  (versus  hard  coded 
system  specific  rules)  and  provide  powerful 
visualization  facilities  for  the  set  of  events  recognized 
in  a  program. 

The  techniques  used  in  the  current  system  are  all 
static  analysis  techniques  and  thus  are  inherently 
limited  by  the  degree  to  which  static  analysis  can  be 
used  to  evaluate  run  time  behavior.  We  have 
implemented  a  program  slicing  technique  that  enables 
static  evaluation  of  program  values  where  feasible. 
This  capability  is  currently  limited  in  its  ability  to  deal 
with  name  aliasing  (an  intractable  problem)  but  we  are 
continuing  to  increase  its  abilities  to  resolve  aliases 
and  statically  evaluate  resulting  program  slices. 

By  applying  powerful  program  analysis  capabilities 
in  concert  with  recognition  rules  derived  from  some 
basic system design knowledge, a significant level of
system  design  recovery  can  be  achieved.  The 
information  is  derived  directly  from  the  source  code  and 
traceable  back  to  that  source  code.  As  software 
baselines  change,  design  recovery  can  be  reapplied  to 
produce  current  design  information,  yielding  a  form  of 
"living"  documentation  that  can  reliably  aid  in  program 
maintenance  and  understanding. 

Business process reengineering is generating considerable
interest in the business community because it can reduce
costs and increase an organization’s responsiveness to
competitive challenges and opportunities. The software
maintenance crisis - escalating costs and sluggish
responsiveness to changing requirements - is not unlike the
corporate problems business process re-engineering has been
turning around. This article applies business process
re-engineering principles and insights to the problem of software
maintenance. Three of its principles are explored in depth:
surfacing tacit assumptions and rules, defining processes,
and using information technology as an enabler. In particular,
three integrated, multi-tool CASE environments are
proposed as the enabling technological infrastructure for
maintenance process re-engineering. These environments
include the Software Specification Environment, the Software
Engineering Environment, and the Software Re-engineering
Environment. Scenarios of operation illustrate how
the integrated environments leverage human resources, not
only for maintenance activities, but also for the phased
forward development method described in DoD-STD-2167A,
for the prototyping and evolutionary development
methods proposed in the Software Design Document
(DoD-STD-SDD), and for software development based on
domain reuse libraries.

1.  Introduction 

Maintenance consumes up to 70-75 percent of a software
system’s cost over its lifetime, dwarfing the phases of
requirements analysis, specification, design, implementation,
testing, and production (Bloom, 1990; CSTB Report,
1990). This statistic is even more stunning when combined
with the opportunity cost of diminished funds for replacing
existing systems and lengthened payback periods for
justifying new systems. Escalating costs and an inability to
respond quickly to changing requirements also characterize
many problems found in business organizations. Squeezed
by rising costs and global competition, many corporations
have turned to business process re-engineering (BPR).

¹Also affiliated with Drexel University
²Also affiliated with University of Pennsylvania


Business process reengineering has achieved radical
breakthroughs in solving seemingly intractable organizational
problems (Hammer and Champy, 1993; Stewart, 1993;
Hall et al., 1993). This article describes the application of
BPR thinking to the intractable problem of maintenance in
the software life cycle.

BPR is defined as the fundamental rethinking and radical
redesign of business processes to achieve dramatic
improvements in performance measured in such areas as cost,
quality, service, and speed (Hammer and Champy, 1993).
A process is defined as a collection of activities that takes
one or more kinds of input and creates an output that is of
value to the customer. We begin the fundamental rethinking
of maintenance in Section 2 by applying the BPR principle
of examining tacit assumptions and rules for error,
obsolescence or irrelevance. This examination revealed that
prevailing assumptions about software maintenance regard it
either as separate and distinct from software development
activities, or as an activity that can be accomplished largely
through redesigning software and generating new code with
forward CASE tools. Consequently, CASE technology that
integrates domain and application engineering for software
reuse and forward engineering with the maintenance
activities of re-engineering (i.e., reverse engineering, program
understanding, and restructuring) has been neglected.³

Fundamental rethinking is continued in Section 3
where we apply the BPR principle of defining processes.
This resulted in a fundamental reconceptualization of modern
software practices and the capabilities required for their
support. Re-engineering (i.e., maintenance) processes are
seen to be present in domain and application engineering
and in forward engineering’s iterative phases. Although the
reasons for initiating these activities differ, they employ
identical technical processes. According to BPR, there is no
justification for separating software development from
maintenance with respect to software engineering jobs and
organizational structures, management and measurement
systems, and values and beliefs.

³Reengineering includes reverse engineering, program restructuring,
software understanding and translating legacy software into modern
programming languages. Reverse engineering is the recovery of program design
information from its code. Reengineering not only recovers design
information from existing software, but may involve program restructuring.
In program restructuring, the software engineer uses reverse engineered
design information to alter or reconstitute the existing system in an effort to
improve its overall quality. In most cases, reengineered software
reimplements the function of the existing system. But at the same time, the software
developer also adds new function and/or improves overall performance
(Pressman, 1992). Program restructuring requires software understanding.
To achieve software understanding, programs need to be understood from a
number of viewpoints, e.g. scope of variables, concurrent structure,
sequential and temporal execution behavior. This reduces the intellectual difficulty
of conceptualizing large, complex software systems (Brown et al., 1992).
Re-engineering also comprehends conversion of legacy software into
modern programming languages, such as Ada and C++ (Pressman, 1992).

After fundamental rethinking comes radical redesign,
where the third BPR principle, use information technology
as an enabler, is applied. Section 4 describes the capabilities
of a new generation of CASE technology and the three
advanced prototype environments that implement these
capabilities: the Software Specification Environment, the
Software Engineering Environment, and the Software
Re-engineering Environment. Section 5 illustrates the
re-engineered processes and Section 6 summarizes and
concludes the article.

2.  Examine  Assumptions  for  Errors, 
Obsolescence,  or  Irrelevance 

Assumption  1:  Maintenance  starts  after  software  release 

In the software engineering literature, several authors
have criticized the prevailing assumption about software
maintenance because it portrays software practice in an
unrealistic manner (Paul and Simon, 1989). This assumption,
which originated with the Waterfall model (Pressman,
1992), views maintenance as a separate software life cycle
that begins after software enters production. That is, before
the first release of the software, all software activities are
assumed to be associated with forward engineering.

Assumption  2:  Maintenance  is  defined  by  administrative 
concerns 

In Assumption 2, technical similarities between
development and maintenance activities are recognized, but the
separation is justified based on administrative concerns.
Both new software development and perfective, adaptive,
and preventive maintenance (described in Section 3)
involve requirements analysis, specification, design,
implementation, and testing. Nevertheless, the difference in how
these activities are constrained and controlled justifies their
separation.

New software development is constrained by customer
requirements, as well as by their budgets, existing
resources, and acquisition policies. Additionally, software
maintenance is constrained by a preexisting software
system (Ford and Gibbs, 1989). While new software
development control is mandated by distinct developmental phases
defined in military standards (e.g., DoD-STD-2167A,
1988), control of software maintenance is problematic at best.

For example, software developers know that change
occurs from the earliest design stages as initial expressions
of customer requirements are refined. After implementation,
controlling changing requirements involves controlling
ongoing iterations that mix old code (typically with
inadequate documentation of original specifications as well
as modifications made over time), new programmers, and
new technology. The control process is ad hoc and the
problem grows over time: the larger the installed base of legacy
code, the more formidable the problems.

Assumption 3: Forward CASE technology can perform almost
all maintenance

Under this assumption, maintenance is understood to
involve not only fixing bugs, but also responding to changes
in customer requirements, to new technology, and to changes in
the external environment, e.g. regulatory changes. Thus,
maintenance is viewed as a component of the forward
development process, involving modifications to requirements
specifications, designs, and source code, testing, and
performing configuration management. Most maintenance,
therefore, can be performed with the same techniques
and products used during forward software development,
but some more specific tasks must be approached
with ad hoc techniques and tools such as reverse engineering
(Fuggetta, 1993). Since technology now permits code
to be generated directly from design specifications, most
"maintenance" activities can be accomplished through
forward redesign and code generation.

Discussion  of  maintenance  assumptions 

This section discusses the consequences of the three
assumptions. Assumption 1 seems to have legitimized the
high costs, poor technological support, and poor management
of maintenance activities (CSTB Report, 1990). For
example, the management of the same software system may
be split between a development manager and a maintenance
manager. Similarly, the development and maintenance
tasks are frequently allocated to different developmental
and maintenance software engineers. In general, maintenance
work is perceived as less desirable than new software
development, and few programmers aspire to a career in
maintenance.

BPR views this condition as a symptom of the industrial
organization structure introduced by Adam Smith.
This structure is characterized by the specialization of
work, which is broken down into simple tasks and assigned
to separate organizational units. When work is reengineered
into processes, several jobs that cut across organizational
boundaries are combined into one, and generalists
performing multidimensional work replace specialists
(Hammer and Champy, 1993).

Rather than dividing software into specialized development
and maintenance activities, we should instead conceptualize
software as a process that originates with customer
need and ends with software that meets that need.
Conceptualizing all software as a process will change the
structure of software engineering jobs, the associated management
and measurement systems, and people’s values
and beliefs about the significance of their work.

According to BPR, Assumption 2 errs by organizing
processes around administrative concerns rather than
around the work itself. This leads to a process that is severely
fragmented into specialized jobs and administrative and
control procedures. Each separate task requires a "handoff"
to the next task. Each handoff in turn requires administrative
coordination or control, adding to process overhead.
The integrated tools described in Section 4 will eliminate
the necessity of both administrative and technical handoffs.
Additionally, Assumption 2 is weak because once new software
enters testing, a software system exists.

Assumption 3 suffers from a common failure of BPR
efforts, that of defining a process too narrowly (Hall et al.,
1993). For example, bugs that appear during testing or after
the software is released may be at a lower level than that
provided by program design specifications. Hence it is
sometimes necessary to understand software at the level of
code. Ideally, the software engineer could reverse engineer
from the code to a low level abstraction, correct or restructure
the abstraction, and then forward engineer new code.
However, Assumption 3 does not propose the integration of
tools for forward and reverse engineering (Fuggetta, 1993).

These  assumptions  may  have  influenced  developers 
of  CASE  technology  to  neglect  tools  that  support  mainte¬ 
nance  processes  in  favor  of  tools  that  support  forward  soft¬ 
ware  development  processes  (Chen  and  Norman,  1992). 
Consequently,  the  current  generation  of  CASE  technology 
does  not  support  software  engineering  activities  present  in 
the  entire  life  cycle. 

To support the entire life cycle, integrated tools are
needed having capabilities not only for software specification,
design, code generation and testing, but also for reverse
engineering, re-engineering legacy software, and
evaluating components in a reuse library. Modern software
practice thus requires a new generation of integrated CASE
technology that supports the complete software life cycle
process.

Accomplishing  this  objective  requires  coordinated 
managerial  and  technological  support.  Integrated  tools  that 
support  both  software  development  and  maintenance  acti¬ 
vities  could  provide  the  technological  foundation,  but  im¬ 
proving  administrative  controls  can  come  only  from  stan¬ 
dards  that  recognize  the  reality  of  change  in  the  software  life 
cycle. 

SDD  -  A  step  in  the  right  direction 

The new DoD draft standard for software development,
Software Design Document (SDD), is a step in this
direction (DoD-STD-SDD, 1992). SDD, in addition to encouraging
software reuse, permits the 2167A life cycle
model to be supplemented or replaced with alternative
models that explicitly recognize change as inherent in large
systems. Thus, both mission critical and information systems
development under SDD could follow the spiral
(Boehm, 1988), the evolutionary (Gilb, 1988), or the prototyping
model (Boar, 1984). These alternative models are
realistic  because  they  recognize  the  necessity  of  iteration 
within  and  between  life  cycle  phases. 

The evolving SDD standard provides an excellent opportunity
for the software engineering community to reconceptualize
the role of maintenance activities in the software
life cycle. Once it is recognized that maintenance activities
include  an  intensive  mix  of  forward  and  re-engineering 
processes,  integrated  tools  and  environments  can  be  devel¬ 
oped  that  reduce  the  time  needed  to  transition  among  these 
activities.  Administrative  controls  linked  to  technology  use 
can  then  be  introduced  to  monitor  and  control  maintenance 
costs. 

3.  Define  Processes 

This  section  identifies  the  capabilities  required  to  sup¬ 
port  maintenance  processes  in  the  software  life  cycle. 
Maintenance  and  new  software  development  are  shown  to 
require  many  common  capabilities,  indicating  the  need  for 
integrated toolsets and environments. Maintenance processes
have been studied by the National Institute of Standards
and Technology (NIST) (CSTB Report, 1990). A description
of each maintenance process and the percentage of
time devoted to each follows:

Perfective maintenance, or enhancements (50% or more),
introduces major transformations in form, functions, and
objectives.

Adaptive maintenance (25%) responds to changes in the
external environment, including conversion of legacy code
into modern programming languages.

Corrective maintenance (20%) includes diagnosis and
correction of design, logic or coding errors.

Preventive maintenance (5%) improves future maintainability
and reliability.

Along the horizontal axis of Table 1 are listed the
maintenance processes and the percentage of time devoted
to  each.  Along  the  vertical  axis  are  listed  the  capabilities 
required  by  a  software  engineer  from  CASE  tools  in  order  to 
perform  these  processes  effectively  and  efficiently.  Note 
that  some  capabilities,  such  as  testing,  understanding  the 
existing  system’s  software  architecture  and  interfaces,  and 
configuration  management,  apply  to  multiple  processes. 


Software  engineers  spend  most  of  their  time  thinking 
about  software.  A  perusal  of  these  capability  requirements 
suggests  that  software  engineers  could  devote  less  time  to 
maintenance  processes  if  tools  were  available  for  facilitat¬ 
ing  the  understanding  of  complex  software  relationships 
and  the  performance  of  thought-intensive  cognitive  tasks. 
For  example,  tools  for  software  understanding  could  reduce 
the  time  needed  for  all  maintenance  processes.  Tools  for 
creating and updating requirements specifications could reduce
the time devoted to perfective and adaptive maintenance.
Tools for automating the translation of legacy software
into modern programming languages could
significantly reduce the time devoted to adaptive maintenance.
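
The arithmetic behind these three examples can be made explicit. The following is a minimal sketch, not part of the reported work: it uses the time shares quoted in this section (50%, 25%, 20%, 5%), restates the tool-to-process assignments from the paragraph above, and introduces a hypothetical helper `share_affected`. The result is only the share of maintenance time a tool could touch, not a measured saving.

```python
# Share of total maintenance time per process (figures quoted in this section).
TIME_SHARE = {
    "perfective": 0.50,
    "adaptive": 0.25,
    "corrective": 0.20,
    "preventive": 0.05,
}

# Processes each kind of tool could speed up, as stated in the paragraph above.
TOOL_SCOPE = {
    "software understanding": set(TIME_SHARE),
    "requirements specification": {"perfective", "adaptive"},
    "legacy translation": {"adaptive"},
}


def share_affected(tool: str) -> float:
    """Return the fraction of maintenance time spent in processes the tool touches."""
    return sum(TIME_SHARE[process] for process in TOOL_SCOPE[tool])


for tool in TOOL_SCOPE:
    print(f"{tool}: {share_affected(tool):.0%} of maintenance time")
```

Under these assumptions, a tool for software understanding touches all maintenance time, a specification tool three quarters of it, and a translation tool one quarter.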


4.  Use  Information  Technology  as  an  Enabler 

BPR  success  depends  upon  information  technology. 

Integrated  databases,  networks,  computer-supported  coop¬ 
erative  work  environments,  client-server  architectures  and 
expert  consultative  systems  provide  BPR  infrastructure. 

Maintenance process reengineering requires a CASE
technology infrastructure. This section proposes such an infrastructure
based on the research and development work at
Computer Command and Control Company (CCCC).

CCCC has been engaged in developing software reengineering
and specification technology as part of the research
and development programs at the Naval Surface Warfare
Center (NSWC) and the Joint Logistics Commanders
(JLC). These types of tools are being integrated with the
forward software engineering tools of the Domain Specific
Software Architecture (DSSA) (Mettala and Graham,
1992) and with the ARPA/STARS Software Engineering
Environment (SEE) (Foreman, 1992).

Table 1: Maintenance Process/Required Capabilities Matrix.

Maintenance processes (columns): Perfective (50%), Adaptive (25%),
Corrective (20%), Preventive (5%). Rows with four marks apply to all
four processes; the column positions of rows with fewer marks are not
preserved.

Required Capabilities                                  Marks
-----------------------------------------------------  -------
Generate current software abstractions                 X X X X
  (specifications) from implemented software.
Generate/revise/confirm software specifications        X
  from a specification reuse repository.
Reengineer/update a domain from specification.         X X X X
Analyze commonality/variability in the domain          X X
  to engineer the application.
Query, retrieve, analyze, and understand reuse         X
  repository of code specifications.
Generate code from specifications.                     X X X X
Perform software concurrency analysis.                 X X X X
Perform software simulation.                           X X X X
Testing.                                               X X X X
Configuration management.                              X X X X
Convert from legacy language to modern language.       X X
Understand converted legacy software.                  X X
Restructure system boundaries to support partial       X X
  retirement of legacy systems.
Reorganize/restructure converted legacy code into      X X
  new programming paradigms, e.g. object-oriented.
Extract components from legacy systems for reuse       X X
  libraries and/or new systems.
Migrate to new development and production              X
  environments, e.g. open systems.
Understand code visually (i.e. graphically)            X X X X
  from different perspectives.
Manipulate visual code representations.                X X X X
Generate new code from visual representations.         X X X X

Figure  1  shows  how  these  environments  are  integrated 
to  support  the  complete  software  life  cycle  processes. 


Software Specification Environment (SSE)

The SSE appears at the top left of Figure 1. SSE facilitates
the creation and updating of software specifications
conforming to DoD Standard 2167A (DoD-STD-2167A,
1988). Subsystems store, compose, and update Data Item
Description (DID) documents, including the System/Segment
Specification, the System/Segment Design Document,
the System Requirement Specification, and the Interface
Requirements Specification.

The inclusion of SSE reflects the importance of software
specifications for an orderly software life cycle. SSE is
an integrated set of information repositories and tools. SSE
guides, instructs and informs staff in composing, updating
and evaluating preliminary requirements and specifications.
Typical users of SSE are Development Managers,
Software Support Activities, or Contractors. SSE allows a
user to manage, query and update its application system repository.
Staff may ask complex technical questions about
[...]ments of retrieved answers can be extracted for inclusion as
updates to relevant new documents. SSE leads the inexperienced
specifier in a "step-by-step" manner and provides
traceability to the source documents used to update specifications.
SSE has been used in a demonstration project for
the Tactical Air Mission Planning System (TAMPS) program
at the Naval Air Warfare Center, Warminster, PA.


SSE subsystems include:

•  Document Manager: This is used by the data administrator
to create and catalogue documents in the repository

•  Assignment Manager: This is used by the manager to
enter the work plan for subordinates who compose or
update documents

•  Step-by-step: This is used to guide specifiers in
searching previous documents and composing or updating
requirements and DIDs

•  Evaluate: This is used to provide feedback on completeness
of DID coverage

These subsystems are supported with commercial off-the-shelf
software (COTS):

•  Document loading and publishing - Interleaf

•  Editor - MS Word and WordPerfect

•  Search - Zyindex
Software Engineering Environment (SEE)

The  SEE  (Foreman,  1992)  is  shown  at  the  top  right  of 
Figure  1.  SEE  incorporates  new  software  development 
technology  for  software  reuse  and  for  automatic  program 
generation,  following  ARPA’s  Domain  Specific  Software 
Architecture  (DSSA)  Program  (Mettala  and  Graham, 
1992). The ARPA STARS SEE for NAVAIR PMA 205 includes
Domain Engineering (DE), which enables a domain
engineer to define the process of producing software for a
class of related applications in a domain, and Application
Engineering (AE), which enables an application engineer to
produce software for an application that belongs to the domain
(SPC, 1992). The Domain Engineering and Application
Engineering (DE/AE) facilities are language independent.

DE/AE  facilities  are  employed  as  follows.  A  specific 
domain  is  comprised  of  software  for  a  closely  related  family 
of  applications.  Once  a  domain  software  architecture 
(Agrawala,  1992)  is  developed,  applications  can  be  gener¬ 
ated.  Domain  Engineering  contains  a  repository  of  reuse 
software artifacts and associated tools. The creator of a domain
architecture is called the Domain Engineer. The DE/AE
facilitates selection of reuse software and generation of
software  to  create  a  specific  application  system.  The  user  of 
the  repository  and  of  the  tools  is  called  the  Application  En¬ 
gineer  (SPC,  1992). 

The  reuse  software  is  part  of  the  DE/AE  repository 
(see Figure 1). The reuse software is organized as a hierarchy
of software artifacts that follow the domain architecture.
For example, DoD software follows a standard hierarchical
tree structure of software units called System,
Segment, Computer Software Configuration Item (CSCI)
and Software Unit (DoD-STD-2167A). Each software unit
in the hierarchy has a specification of its position in the architecture
hierarchy, its capabilities, interfaces and dependencies
on other software units. Six types of Software Abstraction
Documents are created to document the
architecture. These include:

•  Hierarchical  decomposition  diagram:  This  graph 
shows  the  decomposition  of  the  overall  software  into 
hierarchical  units 

•  Flow  diagram:  This  graph  shows  the  flow  of  data  and 
control  within  and  between  hierarchical  units 

•  Interface  table:  These  tables  show  the  structure  of  in¬ 
puts  and  outputs  of  each  hierarchical  unit 

•  Object/Use  diagram:  This  graph  shows,  for  each  hier¬ 
archical  unit,  where  types  or  generics  are  defined  and 
where  they  are  used 


•  Context  diagram:  This  graph  shows  the  library  units 
and  where  they  are  used 

•  Comments text: This text records the software comments
found in each hierarchical unit. They are assumed
to contain information about the hierarchical
unit's capabilities

The  capabilities  of  the  hierarchical  software  units  de¬ 
termine  commonality  and  variability  among  them.  It  is  pos¬ 
sible  to  navigate  through  the  domain  hierarchy  tree  by  refer¬ 
ring  to  capabilities  and  selecting  hierarchical  software  units 
based  on  commonality  and  variability  of  their  respective  ca¬ 
pabilities.  Hierarchical  software  units  may  be  parameter¬ 
ized  and  a  code  generation  tool  may  be  used  to  select  param¬ 
eters  of  generic  software.  Alternatively,  hierarchical  units 
may  be  completely  generated  based  on  models  of  their  func¬ 
tionality.  A  series  of  tools  is  also  available  in  the  DE/AE  for 
application  modelling,  unit  testing,  and  conversion  to  con¬ 
current  operations. 
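
The selection of units by commonality and variability of capabilities can be illustrated with a small sketch. It is an assumption-laden illustration, not the DE/AE implementation: the class `SoftwareUnit`, the functions `commonality`, `variability` and `select`, and the example unit names are all hypothetical, and capabilities are reduced to plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareUnit:
    """A hierarchical software unit with its specified capabilities (hypothetical model)."""
    name: str
    capabilities: frozenset
    children: list = field(default_factory=list)


def commonality(units):
    """Capabilities shared by every unit in the list."""
    sets = [unit.capabilities for unit in units]
    return frozenset.intersection(*sets) if sets else frozenset()


def variability(units):
    """Capabilities offered by some units but not by all of them."""
    sets = [unit.capabilities for unit in units]
    if not sets:
        return frozenset()
    return frozenset.union(*sets) - frozenset.intersection(*sets)


def select(root, required):
    """Walk the hierarchy tree and return the units whose capabilities cover `required`."""
    found = [root] if required <= root.capabilities else []
    for child in root.children:
        found.extend(select(child, required))
    return found


# Hypothetical two-level domain used only for illustration.
filter_a = SoftwareUnit("track_filter_a", frozenset({"track", "smooth"}))
filter_b = SoftwareUnit("track_filter_b", frozenset({"track", "predict"}))
domain = SoftwareUnit("tracking", frozenset({"track"}), [filter_a, filter_b])

print(sorted(commonality([filter_a, filter_b])))
print(sorted(variability([filter_a, filter_b])))
print([unit.name for unit in select(domain, frozenset({"track", "predict"}))])
```

Set intersection and difference are used here because the text defines commonality and variability purely in terms of unit capabilities; a real repository would presumably compare richer specifications.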

The  SEE  also  contains  PTECH  (PTECH,  1992). 
PTECH permits a software engineer to design object-oriented
software and to generate object-oriented programs in
Ada and in C++ automatically.

Finally,  the  SEE  incorporates  meta  tools  for  tool  in¬ 
tegration. 

Software  Reengineering  Environment  (SRE) 

The  SRE  (CCCC,  1992)  is  shown  at  the  bottom  left  of 
Figure  1.  The  SRE  incorporates  the  technologies  of  soft¬ 
ware  visualization  and  visual  programming.  Software  visu¬ 
alization  overcomes  the  essential  invisibility  (i.e.  non¬ 
physical  quality)  of  software  by  representing  the  program 
structure,  control  flow,  and  data  graphically.  An  abstract, 
graphical  representation  can  facilitate  a  software  engineer’s 
visual  perception  and  cognitive  understanding  of  complex 
software  during  debugging,  monitoring,  and  especially, 
program  restructuring  (Roman  and  Cox,  1993;  Meyers, 
1990). 

Visual  programming  is  a  methodology  of  program¬ 
ming  as  well  as  of  program  maintenance.  In  visual  pro¬ 
gramming,  a  graph  is  composed  and  edited  on  the  screen  of 
a terminal, primarily through use of a pointing device. This
contrasts with conventional textual programming, in which
textual statements are keyed in.

To date, software visualization and visual programming
technologies have been developed for forward software
engineering, facilitating software design to be followed
by implementation. The design diagrams in these
systems also produce software code, partially automatically
and partially manually. In contrast, the SRE derives design
diagrams automatically from software code. In this way,
maintenance can be performed on the reverse engineered
design, which is consistent with the old code, and the entire
new code can be produced automatically.

The SRE has three main capabilities:

(i)  Software  understanding: 

Software Understanding consists of query and retrieval
of graphic diagrams that illustrate the software from various
perspectives. These diagrams are used to visualize specific
aspects of the software. The diagrams are first divided
into in-the-large and in-the-small diagrams. In-the-large
diagrams  help  a  software  engineer  to  visualize  declarations 
of  objects.  In-the-small  diagrams  help  a  software  engineer 
to  visualize  execution  statements  within  individual  pro¬ 
gram  units. 

Ada  program  diagrams  are  stored  in  a  graphic  form  in 
the  repository  of  a  customized  CASE  system.  A  graphic 
query  language  is  provided  for  ad-hoc  browsing  of  the  Soft¬ 
ware  Abstraction  Documents  in  the  graphic  repository. 
These graphs show relations between high or low level hierarchical
units. This facilitates the understanding of the software's
architecture as well as its detailed code. Changes to
the program for debugging or program restructuring can be
made via the graphics used for visualization.

Examples of graphic query retrieval capabilities in
the SRE for understanding and creating reusable software
include:

•  Display Base View

•  Query Base View to create Subview

•  Select root node, e.g. "within Package x..."

•  Select node type, e.g. "show me all of the procedures..."

•  Select relations, e.g. "and their Input/Output"

•  Select depth, e.g. "any children..."

•  Query subviews as needed for progressive graphical
browsing

•  Save subviews as needed for documentation

•  Generate Ada code from any view
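
The four selection steps (root node, node type, relations, depth) amount to a bounded traversal of a program graph. The sketch below illustrates that idea only; it is not the SRE query language. The graph contents, the relation names `contains`, `input` and `output`, and the function `subview` are assumptions made for the example.

```python
from collections import deque

# Hypothetical program graph: node -> (node type, {relation: [target nodes]}).
GRAPH = {
    "Package_X": ("package", {"contains": ["Proc_A", "Proc_B", "Type_T"]}),
    "Proc_A": ("procedure", {"contains": ["Proc_C"], "input": ["Type_T"]}),
    "Proc_B": ("procedure", {"output": ["Type_T"]}),
    "Proc_C": ("procedure", {}),
    "Type_T": ("type", {}),
}


def subview(graph, root, node_type, relations, depth):
    """Return {node: {relation: targets}} for nodes of `node_type` found
    within `depth` containment levels below `root`."""
    result = {}
    queue = deque([(root, 0)])
    seen = {root}
    while queue:
        node, level = queue.popleft()
        kind, edges = graph[node]
        if node != root and kind == node_type:
            result[node] = {rel: edges.get(rel, []) for rel in relations}
        if level < depth:
            for child in edges.get("contains", []):
                if child not in seen:
                    seen.add(child)
                    queue.append((child, level + 1))
    return result


# "Within Package x, show me all of the procedures and their Input/Output,
# any children" restricted here to one level of depth.
view = subview(GRAPH, "Package_X", "procedure", ["input", "output"], depth=1)
print(view)
```

Because a subview is itself a small graph of the same shape, it could be queried again, which corresponds to the progressive graphical browsing mentioned in the list.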

(ii)  Software  abstraction  and  documentation 

SRE partitions the software into multi-level hierarchical
software units conforming to the 2167A standard for
describing the software architecture. Software Abstraction
Documents are then generated that describe the architecture
of these units from different perspectives.


(iii)  Software  capture  and  its  transformation  to 
Ada 

The SRE translates CMS-2 code, statement by statement,
into a pseudo-Ada, called Elementary Statement
Language (ESL) Ada. Next, the ESL-Ada programs are
transformed into the Ada programming paradigm in a series
of passes that achieve 100% translation to Ada. Each pass
translates  different  aspects  of  the  programming  paradigm  of 
the  source  language  into  the  Ada  programming  paradigm 
(e.g.,  object  declarations  and  execution  statements).  During 
the  transformation  process,  seven  different  sets  of  relations, 
each  defining  a  different  view  of  the  associations  among 
programming  objects,  are  generated.  At  the  end  of  the 
transformation  process,  these  relations  are  converted  into 
graphic  structures  in  the  form  of  Entity-Relation-Attribute 
(ERA)  diagrams.  The  ERA  representation  is  the  main  ve¬ 
hicle  for  graphic  program  analysis  and  visualization.  Visu¬ 
alization  is  used  for  query,  retrieval,  understanding,  restruc¬ 
turing  and  generating  the  documentation  of  programs. 
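
The conversion of relation sets into an Entity-Relation-Attribute structure can be sketched as follows. This is an illustrative assumption, not the SRE's internal representation: the classes `Entity` and `EraGraph`, the function `build_era`, and the two relation names `declares` and `calls` are hypothetical, and the seven relation sets of the actual transformation are not reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A programming object together with its attributes."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class EraGraph:
    """Entities indexed by name plus a list of (relation, source, target) triples."""
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def entity(self, name, **attributes):
        """Create the entity if needed and merge in any attributes."""
        ent = self.entities.setdefault(name, Entity(name))
        ent.attributes.update(attributes)
        return ent


def build_era(relation_sets, attributes):
    """Merge several named relation sets into one ERA graph."""
    graph = EraGraph()
    for relation, pairs in relation_sets.items():
        for source, target in pairs:
            graph.entity(source, **attributes.get(source, {}))
            graph.entity(target, **attributes.get(target, {}))
            graph.relations.append((relation, source, target))
    return graph


# Hypothetical relation sets, each giving one view of the same program.
relation_sets = {
    "declares": [("Pkg", "Proc")],
    "calls": [("Proc", "Helper")],
}
attributes = {"Pkg": {"kind": "package"}, "Proc": {"kind": "procedure"}}

era = build_era(relation_sets, attributes)
print(sorted(era.entities))
print(era.relations)
```

Keeping every relation set in one graph is what allows a single diagram store to serve query, visualization and documentation alike.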

The  integration  of  the  SSE,  SRE,  DE/AE  and  PTECH 
is  accomplished  through  two  interfaces,  also  shown  in  Fig¬ 
ure  1.  Their  descriptions  follow: 

Interface  between  SRE  and  SSE: 

This  interface  is  shown  at  the  middle  left  of  Figure  1 
(Prywes  et  al.,  1993).  This  interface  provides  a  reverse  pro¬ 
cess  to  produce  information  for  the  software  requirements 
and  specifications  and  other  documentation  from  program 
code. SSE receives the documentation from SRE. The Software
Abstraction Documents map into specific paragraphs
of the 2167A/SDD specifications, as shown in Table 2. For
example, Table 2 shows that the Hierarchy graph can be
used for the System Architecture diagram required in paragraphs
3.1 and 3.2.3 of the System/Segment Specification
of 2167A and SDD.

Interface  between  SRE  and  SEE 

This  interface  is  shown  in  the  left-hand  portion  of  Fig¬ 
ure  1  (Prywes  and  Lee,  1993).  The  SRE  provides  DE/AE 
and  PTECH  with  software  documentation  in  the  form  of 
high-level  graphic  views  of  the  architecture  as  well  as  de¬ 
tailed  graphic  views  of  algorithms.  The  SRE  can  process 
legacy  code  as  well  as  reuse  code  from  the  DE/AE  reposito¬ 
ry.  The  SRE  generates  key  parts  of  the  specifications  of 
each  hierarchical  software  unit  (Table  2).  The  capabilities 
of  each  hierarchical  software  unit  in  the  specifications  are 
employed  to  establish  commonality  and  variability  among 
the  domain  architecture’s  hierarchical  software  units. 
Tools for the SSE and SRE environments and their interfaces
were developed by CCCC.

Table 2: Mapping Software Abstraction
Documents into Software Specifications.


The required maintenance capabilities listed in Table
1 are now mapped into their respective environments in
Table 3.

5. Illustrations of Re-Engineered Life Cycle
Processes

Several  scenarios  illustrating  iterative  use  of  the  tools 
shown  in  Figure  1  are  possible.  The  scenarios  depend  on  the 
history of software development, previous life cycles and
maintenance  upgrades.  Typically,  the  tools  will  be  used  it¬ 
eratively  until  a  desired  new  or  maintained  application  sys¬ 
tem  is  obtained.  The  following  two  scenarios  illustrate  the 
need  for  integrated  forward,  reverse,  and  reengineering  ca¬ 
pabilities  throughout  the  software  life  cycle.  The  first  sce¬ 
nario  illustrates  how  the  integrated  capabilities  facilitate 
software  development  with  DE/AE  reuse  repositories.  The 
second  illustrates  how  the  integrated  capabilities  facilitate 
the  re-creation  of  software  specifications. 

Facilitating  software  development  with  domain 
and  application  engineering  reuse  repositories 

Assume  that  totally  new  application  software  is  de¬ 
sired.  The  preliminary  requirements  have  been  generated 
by  the  application’s  Program  Manager.  The  platform  to  be 
used and its dynamics may be derived through modelling
and simulation. The Specifier, with the aid of SSE, uses the
requirements  to  compose  hierarchically  structured  specifi¬ 
cations.  The  capabilities  are  then  communicated  to  the  Ap¬ 
plication  Engineer  who  uses  the  specifications  to  select  ar¬ 
chitecture  units  from  the  domain  and  to  generate  new  code 
to  create  application  software.  If  unable  to  do  so,  the  Do¬ 
main  Engineer  may  be  called  to  expand  the  scope  of  the  do¬ 
main.  Expanding  the  scope  of  a  domain  requires  under¬ 
standing  the  impact  of  commonality  and  variability  in 
capabilities  of  each  architecture  unit  and  its  code.  In  either 
case,  the  SRE  tool  will  be  used  to  document  and  display  the 
new  domain  and/or  application  software.  Software  Ab¬ 
stractions  will  be  reverse  engineered  from  the  code,  giving 
the  architecture  of  the  application  software.  The  Software 
Abstractions are next used by the Specifier, who employs
SSE  to  update  the  specifications.  The  Domain  Engineer 
will  use  the  Software  Abstractions  to  verify  and  update  the 
domain  software.  The  Application  Engineer  will  use  the  re¬ 
verse  engineered  software  abstractions  to  document  the  ap¬ 
plication  software.  This  cycle  may  be  repeated  a  number  of 
times  until  satisfactory  application  software  is  realized. 

The SRE can process software from the various
sources (domain, application, legacy) and augment the domain
architecture to satisfy new capabilities. Visualization
graphs and Software Abstractions portray the architecture
of the system as well as Ada code artifacts. The Software
Abstractions are communicated to the SSE so that the Specifiers
can incorporate them in updated specifications. The
abstractions and code are communicated to the Domain Engineer,
who can use them to update the domain. The abstractions
and program visualizations are communicated to the
Application Engineer, who can use them to create documentation
of the implemented application software.

Facilitating  re-creation  of  software  specifications 

Reverse and re-engineering capabilities can also play
a significant role in the interim before domain and application
engineering become a reality. For example, the reverse
engineering capability can be used to confirm software
specifications developed by outside vendors. In the context
of new software development, DoD Standard 2167A (1988)
prescribes that software specifications for mission critical
software  be  partitioned  into  Computer  Software  Configura¬ 
tion  Items  (CSCIs).  CSCI  development  is  contracted  to  out¬ 
side  vendors.  Mandated  periodic  reviews  determine  wheth¬ 
er  the  code  under  development  meets  the  original 
specifications.  Program  Offices  receiving  delivery  of  new 
code,  as  part  of  a  review,  need  to  confirm  conformance  of 
code  to  specifications. 
Table 3: Required Capabilities/CASE Technology Matrix.

This confirmation can be accomplished by reverse engineering
the delivered software and comparing the software
abstractions to the original specifications. The earlier
deviations are found, the greater the future savings in development
time and cost. The reverse specification capability
would also enable more objective formal reviews during
software development because the Software Abstraction
Documents can be compared to the original specifications.

Newer software life cycle methods, such as those that
include prototyping, joint application development, and
evolutionary development, will increase the frequency with
which the reverse specification capability will be needed
because these methods do not assume the existence of a
complete set of requirements before design and program development
begin. Additionally, changes typically are
made quickly at a user's request at the level of code, not
specifications. Therefore, Software Abstractions generated
automatically from evolutionary or prototype software can
keep the specifications current at very little cost.
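
The comparison of reverse engineered abstractions with the original specifications can be pictured as a difference between two descriptions of the same system. The sketch below is a simplification under stated assumptions: each description is reduced to a mapping from a unit name to the set of interfaces it exposes, and the function `compare` and the example unit names are hypothetical.

```python
def compare(specified, recovered):
    """Compare specified units with units recovered by reverse engineering.

    Both arguments map a unit name to the set of interfaces it exposes.
    """
    spec_names, code_names = set(specified), set(recovered)
    mismatched = {
        name: {
            "missing": sorted(specified[name] - recovered[name]),
            "extra": sorted(recovered[name] - specified[name]),
        }
        for name in sorted(spec_names & code_names)
        if specified[name] != recovered[name]
    }
    return {
        "missing": sorted(spec_names - code_names),      # specified, not delivered
        "unspecified": sorted(code_names - spec_names),  # delivered, not specified
        "mismatched": mismatched,                        # delivered with differences
    }


# Hypothetical specification and reverse engineered abstraction.
specified = {"nav": {"get_fix", "set_route"}, "comms": {"send"}}
recovered = {"nav": {"get_fix"}, "display": {"draw"}}

report = compare(specified, recovered)
print(report)
```

An empty report would indicate conformance at this level of abstraction; the same report, read in the other direction, lists what an obsolete specification is missing relative to the code.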

The  reverse  engineering  capability  is  also  needed  for 
the  software  maintenance  activity  of  updating  obsolete 
specifications  or  creating  specifications  for  undocumented 
software.  Software  managers  are  continually  confronted  by 
problems  associated  with  outdated  or  unavailable  software 
specifications. Updates to specifications typically lag updates
to software. Under time and budget pressure,
it is convenient to make modifications to the code and neglect
the corresponding changes to the respective software
specifications. This is a critical problem since the specification
plays a central role in the contracting, scheduling, planning,
design, implementation and post-deployment support.
The reverse specification capability will make it
easier for software program managers to update obsolete
specifications and create specifications where none existed.

6.  Summary  and  Conclusions 

This  article  described  the  application  of  three  business 
process  reengineering  principles  to  the  problems  of  soft¬ 
ware  maintenance  within  the  software  life  cycle.  The  first 
principle,  examine  tacit  assumptions  and  rules  for  error,  ob¬ 
solescence  or  irrelevance,  revealed  that  prevailing  obsolete 
assumptions  about  software  maintenance  regard  it  either  as 
separate  and  distinct  from  software  development  activities, 
or  as  an  activity  that  can  be  accomplished  largely  through 
redesigning  software  and  generating  new  code  with  forward 
CASE  tools.  The  case  was  made  for  technology  that  inte¬ 
grates  DE/AE  for  software  reuse  and  forward  engineering, 
with  the  maintenance  activities  of  re-engineering,  i.e.  re¬ 
verse  engineering,  program  understanding  and  restructur¬ 
ing. 

The second principle, define processes, resulted in a
fundamental reconceptualization of modern software practices.
Re-engineering, i.e. maintenance processes, were
shown  to  be  present  in  software  development,  i.e.,  domain 
and  application  engineering  for  software  reuse  and  forward 
engineering’s  iterative  phases.  Although  the  reasons  for 
initiating  these  activities  differ,  they  employ  identical  tech¬ 
nical  processes.  According  to  BPR  principles,  there  is  no 
justification  for  separating  software  development  from 
maintenance  with  respect  to  process  definition,  software 
engineering  jobs  and  organizational  structures,  manage¬ 
ment  and  measurement  systems,  and  values  and  beliefs. 

The third principle, use information technology as an
enabler, was employed to describe the design of CASE
technology having capabilities supporting all software life
cycle processes. These capabilities include:

•  generating,  revising,  and  confirming  software  specifi¬ 
cations, 

•  generating  new  software, 

•  performing  domain  and  application  engineering  to  se¬ 
lect,  analyze  and  generate  software  from  reuse  li¬ 
braries, 

•  reengineering legacy software into modern programming
languages,

•  visualizing,  understanding  and  restructuring  software 
for  maintenance  and  quality  improvements, 

•  reverse  engineering  software  to  recapture  its  design. 

Three advanced prototype environments that implement
these capabilities were described: the Software Specification
Environment, the Software Engineering Environment,
and the Software Reengineering Environment.

Illustrations  of  the  reengineered  processes  described 
how  the  integrated  environments  leverage  human  resources 
not  only  for  maintenance  activities,  but  also  for  the  phased 
forward  development  method  described  in  DoD- 
STD-2167A,  for  the  prototyping  and  evolutionary  develop¬ 
ment  methods  proposed  in  the  Software  Design  Document 
(DoD-STD-SDD),  and  for  software  development  based  on 
domain  reuse  libraries. 

In  conclusion,  these  integrated  environments  have  the 
potential  to  enable  the  DoD  to  realize  the  maintenance  per¬ 
formance  improvements  in  cost,  quality,  service  and  speed 
promised  by  BPR.  This  article  demonstrated  that  the  main¬ 
tenance  process  can  be  re-engineered.  However,  imple¬ 
menting  a  re-engineered  process  requires  concomitant 
changes  in  the  job  descriptions,  organizational  structures, 
management and measurement systems, and values and beliefs
of the organizational participants. Certainly, the Capability
Maturity Model deserves close examination for its
contribution to the management and measurement aspects
(Humphrey, 1989).

Although  BPR  can  be  a  formidable  implementation 
task,  the  new  generation  of  CASE  technology  described 
above  can  provide  the  infrastructure  needed  to  begin  revers¬ 
ing  the  maintenance  crisis. 

References 

Agrawala, A., et al., "Domain-Specific Software Architectures
for Intelligent Guidance, Navigation & Control,"
Proceedings of the DARPA Software Technology Conference
1992, Los Angeles, CA, April 1992.

Bloom, P., "CASE Market Analysis," Volpe, Welty and Co.,
San Francisco: 1990.


Boar, B., Application Prototyping, Wiley-Interscience,
1984.

Boehm, B., "A Spiral Model for Software Development and
Enhancement," IEEE Computer, Vol. 21, No. 5, May
1988, pp. 61-72.

Brown, A., Earl, A., and McDermid, J., Software Engineering
Environments: Automated Support for Software Engineering,
McGraw-Hill, London, 1992.

Computer Command and Control Company, Technical Report,
"Software Intensive Systems Reverse Engineering,"
prepared under Naval Surface Warfare Center Contract
N60921-90-C-0298, April 1992.

Chen, M., and Norman, R., "A Framework for Integrated
CASE," IEEE Software, March 1992, pp. 18-22.

CSTB Report, "A Research Agenda for Software Engineering,"
CACM, Vol. 33, No. 3, March 1990, pp. 281-293.

DoD-STD-2167A, 1988, "Defense System Software Development,"
September 1988.

DoD-STD-SDD, 1992, "Software Design Document,"
Draft, December 1992.

Ford, G., and Gibbs, N., "A Master of Software Engineering
Curriculum," IEEE Computer, September 1989,
pp. 59-70.

Foreman, J., "STARS: State of the Program," STARS '92
Conference, 1992, pp. 20-41.

Fuggetta, A., "A Classification of CASE Technology,"
IEEE Computer, December 1993, pp. 25-38.

Gilb, T., Principles of Software Engineering Management,
Addison-Wesley, 1988.

Goldman, S. L., and Nagel, R. N., "Management,
Technology, and Agility: the Emergence of a New Era in
Manufacturing," International Journal of Technology
Management, Vol. 8, No. 1/2, pp. 18-38, 1993.

Hall, G., Rosenthal, J., and Wade, J., "How to make reengineering
really work," Harvard Business Review,
November-December 1993, pp. 119-133.

Hammer, M., and Champy, J., Reengineering the Corporation:
A Manifesto For Business Revolution, HarperCollins,
1993.


Humphrey, W., Managing the Software Process, Addison-Wesley,
Reading, MA: 1989.

Mettala, E., and Graham, M., "Domain Specific Software
Architecture Program," Proceedings of the DARPA Software
Technology Conference 1992, Los Angeles, CA,
April 1992.

Meyers, B., "Taxonomies of Visual Programming and Program
Visualization," J. Visual Languages and Computing,
Vol. 1, No. 1, 1990, pp. 97-123.

Nielsen, J., "Non-command User Interfaces," CACM, Vol. 36,
No. 4, April 1993, pp. 83-99.

Paul, J., and Simon, G., "Bugs in the system: Problems in federal
government computer software development and
regulation," U.S. Government Printing Office, Washington,
D.C., September 1989.

"PTECH Design and PTECH Code, Release 3.5, Tool User's
Guide," Associative Design Technology, March 1992.

Pressman, R., Software Engineering: A Practitioner's Approach,
3rd ed., McGraw-Hill, NY: 1992, pp. 680-683.

Prywes, N., Ingargiola, G., and Ahrens, J., "Automatic Reverse
Engineering of Software to Confirm/Update Requirements
Specification," Computer Command and
Control Company, Philadelphia, PA, 19103, June 1993a.

Prywes, N., and Lee, I., "Integration of Software Specification,
Reuse and Reengineering," Computer Command and
Control Company, Philadelphia, PA, 19103, June 1993b.

Roman, G., and Cox, K., "A Taxonomy of Program Visualization
Systems," IEEE Computer, December 1993, pp.
11-24.

SPC, "Domain Engineering Guidebook," Technical Report
SPC-92019-CMC, Software Productivity Consortium,
December 1992.

Stewart, T., "Reengineering: The Hot New Managing
Tool," Fortune, pp. 41-48, August 23, 1993.

Texas Instruments and Price Waterhouse, "Reengineering
for Information Engineering White Paper," October 5,
1992.


A  Syntax-Directed  Tool  for  Program 
Understanding  and  Transformation* 


William  G.  Griswold  Darren  C.  Atkinson 

Department  of  Computer  Science  &  Engineering,  0114 
University  of  California,  San  Diego 
San  Diego,  CA  92093-0114  USA 


Abstract 

Software  maintenance  is  often  too  expensive.  Part 
of  the  problem  is  that  the  repeated  modifications  of  a 
software  system  degrade  its  structure,  making  it  dif¬ 
ficult  to  understand  and  modify.  Semantically-rich 
techniques  can  help  restore  the  structure  of  a  system, 
but  they  may  require  concurrency  analysis,  timing 
analysis,  or  dependence  analysis  on  pointers,  which  are 
difficult  to  implement  efficiently.  We  propose  a  fast, 
programmable  tool  that  can  perform  syntactically- 
oriented  text  processing  tasks  for  use  in  program  un¬ 
derstanding  and  transformation.  Because  our  tool  is 
syntactically-oriented,  the  tool  user  must  find  ways  to 
discover the required semantic information. We believe,
however, that syntactic information complemented
by design and domain knowledge is often sufficient to
obtain the needed semantic information. Doing so
may entail iteratively refining a query to find
the right information, which requires a fast tool. We take
a  “traditional”  compilers  approach  to  the  problem  to 
provide  a  tool  with  the  flexibility  and  speed  of  UNIX 
tools  like  awk.  Early  performance  measurements  sug¬ 
gest  that  this  approach  can  produce  results  substan¬ 
tially  faster  than  previous  approaches. 


1  Introduction 

Software  is  perceived  to  be  too  expensive  relative 
to  its  quality.  Since  maintenance  is  the  dominant 
phase  in  the  program  life-cycle  [16],  substantially  re¬ 
ducing  the  cost  of  software  requires  lowering  the  cost 
of  maintenance.  A  significant  part  of  this  cost  is  due  to 
the  fact  that  as  useful  software  ages,  modifications  to 
meet the needs and demands of users are layered upon
the original implementation. Modifications unanticipated
in the design sometimes are not easily integrated
into  the  existing  implementation,  requiring  changes  to 
multiple  modules  in  the  system  to  complete  a  single 
change  [19].  As  repeated  modifications  are  made,  the 
design  and  implementation  become  increasingly  less 
understandable  until  maintenance  becomes  unaccept¬ 
ably  expensive,  and  the  only  solutions  are  to  reimple¬ 
ment  the  system  or  restructure  it  [3][2,  p.  113]. 

An  example  of  a  system  experiencing  this  problem 
is  the  Comprehensive  Health  Care  System  (CHCS), 
an  on-line  system  for  keeping  track  of  hospital  pa¬ 
tient  records,  billing,  communication,  and  prescrip¬ 
tion  ordering.  After  evolving  from  an  initial  proto¬ 
type,  it  is  now  eight  years  old  and  consists  of  approx¬ 
imately  600,000  lines  of  code.  Currently,  to  make  a 
single enhancement to the system costs a minimum of
$100,000.¹

Semantics-based techniques are one method of improving
the maintainability of such systems. For example,
a technique called tool-assisted program restructuring
can help improve a system’s degraded
structure  [10].  A  restructuring  tool  user,  based  on  the 
needs  of  the  desired  enhancements  of  the  system,  uses 
the  tool  to  transform  the  system  (while  preserving  its 
meaning)  to  a  more  appropriate  structure — in  partic¬ 
ular,  one  in  which  the  enhancements  can  be  made  as 
local  changes  to  the  system,  rather  than  as  changes 
throughout  the  system.  There  are  two  benefits  of  this 
approach.  First,  the  structure  of  the  system  is  im¬ 
proved,  allowing  changes  to  be  made  more  easily.  Sec¬ 
ond,  the  tool  performs  all  the  global  changes  required 
to  make  a  structural  change,  and  ensures  that  the 
input-output  behavior  of  the  program  is  unchanged, 
guaranteeing  no  errors  are  introduced  by  the  restruc¬ 
turing  and  freeing  the  programmer  to  concentrate  on 
choosing the appropriate structure. If the tool cannot
perform a transformation so that it preserves meaning,
it  prohibits  the  structural  transformation  and  reports 
the  source  of  the  problem. 

Unfortunately,  the  cost  of  implementing  and  run¬ 
ning  such  tools  can  be  prohibitive.  The  restructuring 
techniques  use  global  data  flow  analysis,  which  is  at 
best difficult to implement efficiently and precisely in
the  presence  of  pointers.  Furthermore,  in  a  system 
such  as  CHCS,  which  is  implemented  in  MUMPS  [15], 
an  execute  command  is  frequently  used  to  translate  a 
text  string  into  code  that  is  then  executed.  CHCS  also 
uses  concurrency.  Such  features  make  cheap,  precise 
data  flow  analysis  untenable.  For  example,  execute 
commands  might  invoke  any  instruction  on  any  data 
structure.  Unless  the  cost  of  implementing  such  anal¬ 
yses  can  be  amortized  over  several  systems,  the  cost 
may  not  be  warranted. 

However,  our  investigations  of  how  programmers 
and  designers  work  have  led  us  to  hypothesize  that 
there  may  be  alternative  or  complementary  ap¬ 
proaches  that  are  easier  to  implement  and  also  support 
a  variety  of  important  tasks  in  software  maintenance. 
First,  we  believe  that  much  of  the  semantic  informa¬ 
tion  that  is  expensive  to  compute,  such  as  data  flow  in¬ 
formation,  can  be  inferred  straightforwardly  from  the 
program  source  if  some  additional  information  can  be 
obtained  from  the  design  of  the  system  or  details  of  the 
application domain. For example, if it is known that
executing an execute command in CHCS never
causes side effects, then it can be inferred that such
commands have no impact on pointer dependences. Second, we
believe programmers and designers frequently want to
perform  tasks  related  to  restructuring — that  is,  tasks 
that  are  global  but  semantically  constrained — but  for 
which  no  specialized  tool  exists.  However,  if  the  in¬ 
formation  required  to  make  the  change  can  be  found 
inexpensively,  the  programmer  knows  how  to  system¬ 
atically  make  the  changes,  perhaps  with  the  help  of  a 
tool. 

Our  approach,  then,  is  to  design  a  syntax-directed 
program  understanding  and  transformation  tool  that 
depends  on  the  tool  user  to  employ  design  and  do¬ 
main  information  to  make  up  for  the  lack  of  detailed 
data  flow  information.  Such  a  tool  reads  in  a  script 
that  specifies  a  program  understanding  or  transforma¬ 
tion  task  to  be  performed,  reads  in  the  system  to  be 
processed,  and  then  performs  the  task  on  the  system. 
Because  the  task  can  be  specified  on  the  spot,  it  can 
use characteristics of the system known to the programmers
(but not readily inferred from the program
text) to cheaply acquire needed semantic information.

Such a tool must be exceptionally fast, easy to use,
and flexible. If the tool is not fast, programmers sometimes
will not use the tool for reasons of expediency,
increasing the chance of incorrect changes. A tool
that is fast will encourage programmers and designers
to pose “what-if” queries to freely explore possible
enhancements and design alternatives. Also, there is
a complex relationship between syntactic information
and semantic information, so several syntactic queries
may  have  to  be  performed  and  successively  refined  to 
obtain  the  desired  semantic  information.  If  it  is  not 
easy  to  specify  program  understanding  and  transfor¬ 
mation  tasks,  programmers  will  again  be  tempted  not 
to  use  the  tool.  If  the  tool  is  not  flexible,  then  it  is 
likely  to  be  outgrown,  so  more  time  and  effort  must 
be  invested  in  other  tools. 

Existing  tools,  such  as  TXL  [8]  and  REFINE  [7], 
provide  much  of  the  needed  functionality,  but  em¬ 
ploy  computational  paradigms  that  compromise  per¬ 
formance,  ease  of  use,  or  flexibility,  limiting  their  abil¬ 
ity  to  perform  quick  queries  or  extensive  analyses  on 
large  systems.  Our  hypothesis  is  that  both  flexibil¬ 
ity  and  good  performance  can  be  achieved  by  under¬ 
standing  what  programmers  and  designers  need,  and 
by  using  existing  compiler  technology.  As  Johnson  ar¬ 
gued,  the  theory  and  practice  of  compilers  have  yielded 
tools,  insights,  and  heuristics  that  significantly  sim¬ 
plify  the  construction  of  a  compiler  [13].  We  em¬ 
ploy  standard,  highly-optimized  parser  tools  such  as 
lex  [14]  and  yacc  [12],  and  build  Abstract  Syntax  Trees 
(ASTs)  in  a  low-level  language  (C)  to  achieve  good 
performance  and  compact  representation.  Because  lex 
and  yacc  generate  much  of  the  C  code,  and  the  ba¬ 
sic  technology  for  describing,  building  and  searching 
ASTs  is  well-understood,  it  is  not  a  burden  to  use  a 
low-level,  efficient  language  like  C  for  the  implemen¬ 
tation  of  such  a  tool.  Only  the  scripting  language 
provided  to  the  tool  user  needs  to  be  high-level,  and 
this  can  be  implemented  in  C  directly  on  top  of  the 
abstract interface to the AST. The language’s computational
model is inherently imperative so that it is
easy for traditional programmers to use.

Our  goal,  then,  is  to  provide  low-cost  program  un¬ 
derstanding  and  transformation  with  a  flexible,  effi¬ 
cient  tool  in  the  style  of  the  UNIX  tool  awk  [1],  except 
with  syntactic,  rather  than  lexical  processing.  The 
matching  and  transformation  features  of  our  tool’s 
scripting  language  make  straightforward  tasks  easy  to 
specify,  but  the  general-purpose  features  of  the  lan¬ 
guage  allow  performing  more  complicated  tasks  such 
as  graphics.  Because  the  task  language  of  the  tool 
is  generic,  and  the  parser  and  AST  are  adaptable,  it 
can easily accommodate a variety of programming languages.

Preliminary  results  indicate  that  our  tool  can  per¬ 
form  a  simple  interrogation  of  the  1  million  MUMPS 
commands  of  CHCS  in  under  10  minutes,  includ¬ 
ing  parsing,  AST  construction,  and  simple  matching. 
Other  techniques  require  orders  of  magnitude  more 
time  or  space  to  perform  the  same  task. 

In  the  following  we  examine  two  typical  scenarios 
in  which  a  program  understanding  and  transformation 
tool can be expected to be used. We then discuss the
designs of two existing tools and describe the design and
implementation of our tool and its customization to
a specific programming language.


2  Tool  use  and  design 

We  have  identified  two  key  scenarios  in  which  a  pro¬ 
gram  understanding  and  transformation  tool  might  be 
used.  These  have  helped  identify  key  requirements 
and  have  driven  our  design  choices.  We  use  this  anal¬ 
ysis  to  examine  the  design  decisions  of  two  existing 
tools. 

2.1  Designer’s  scenario 

A  designer  may  wish  to  discover  how  various  parts 
of  the  system  are  related  in  order  to  assess  the  dif¬ 
ficulty  of  proposed  enhancements  and  to  plan  evolu¬ 
tion  of  the  system’s  structure  in  order  to  accommo¬ 
date  those  enhancements.  For  example,  the  designer 
might  wish  to  know  the  set-use  properties  of  the  sys¬ 
tem,  that  is,  which  functions  use  which  global  vari¬ 
ables.  Potential  modules  can  be  inferred  by  cluster¬ 
ing  functions  according  to  how  they  share  information 
via  variables  [21].  Using  a  syntax-directed  tool,  the 
designer  needs  to  specify  a  one-pass  traversal  over  the 
AST  to  find  the  variable  references  in  each  component, 
recording  in  a  table  all  references  to  global  variables. 
When  the  entire  program  has  been  processed,  the  tool 
would  print  out  a  two-dimensional  array,  with  global 
variables  on  the  rows  and  functions  on  the  columns. 
Each  cell  of  the  array  contains  the  number  of  sets  and 
uses  of  a  variable  by  a  particular  function.  An  en¬ 
hancement  of  this  task  might  include  a  hypertext  view 
of  the  array — built  from  a  standard  graphics  library — 
that  allows  perusing  the  program  source  by  clicking  on 
various  cells,  components,  or  variables. 
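
The set-use query described above can be illustrated with a short sketch. The following Python is not the scripting language of the tool described in this paper; the tuple-based AST, the node kinds ("func", "assign", "use"), and the names PROGRAM, GLOBALS, set_use_table, and format_table are all hypothetical and chosen only for illustration. Aliasing, execute commands, and concurrency are ignored.

```python
from collections import defaultdict

# Hypothetical toy AST, not the tool's real representation:
#   ("func", name, [statements])
#   ("assign", target_variable, [operand nodes])
#   ("use", variable)
PROGRAM = [
    ("func", "admit", [
        ("assign", "patient_count", [("use", "patient_count")]),
        ("assign", "tmp", [("use", "bed_table")]),
    ]),
    ("func", "bill", [
        ("assign", "total_due", [("use", "patient_count"), ("use", "rate")]),
    ]),
]
GLOBALS = {"patient_count", "bed_table", "total_due"}


def set_use_table(program, global_names):
    """One pass over the AST; returns {global: {function: [sets, uses]}}."""
    table = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def visit(node, func):
        if node[0] == "assign":
            _, target, operands = node
            if target in global_names:
                table[target][func][0] += 1   # a set of a global
            for operand in operands:
                visit(operand, func)
        elif node[0] == "use" and node[1] in global_names:
            table[node[1]][func][1] += 1      # a use of a global

    for _, func_name, body in program:
        for statement in body:
            visit(statement, func_name)
    return table


def format_table(table, functions):
    """Globals on the rows, functions on the columns, cells as sets/uses."""
    lines = ["global".ljust(16) + "".join(f.ljust(8) for f in functions)]
    for var in sorted(table):
        cells = ["{}/{}".format(*table[var].get(f, (0, 0))) for f in functions]
        lines.append(var.ljust(16) + "".join(c.ljust(8) for c in cells))
    return "\n".join(lines)
```

Calling format_table(set_use_table(PROGRAM, GLOBALS), ["admit", "bill"]) yields the two-dimensional view, one row per global variable. Because the counts are purely syntactic, a designer would still need design or domain knowledge to account for references made through pointers.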

Unfortunately,  determining  aliasing  (i.e.,  with 
pointers)  of  local  variables  to  global  structures  is  a 
potentially  complicated  calculation,  as  is  inferring  the 
effects of execute commands and concurrency. The
designer may have to use domain and design information
to customize the search to inexpensively resolve
ambiguities.  In  fact,  the  designer  might  have  to  per¬ 
form  several  related  queries  to  derive  all  the  necessary 
information,  examining  the  last  query  and  formulat¬ 
ing  a  new  one  to  augment  the  prior  results.  At  the 
extreme,  the  designer  can  use  the  transformation  com¬ 
ponent  of  the  tool  to  modify  the  existing  system  so 
that  it  records  dynamic  control  flow  and  the  contents 
of  pointers  to  verify  hypotheses  about  the  system.  In 
particular,  extra  statements  can  be  added  to  perform 
dynamic  tracing  of  what  addresses  are  being  derefer¬ 
enced  and  what  statements  are  being  executed.  Run¬ 
ning  the  modified  system  on  the  existing  test  suites 
can  then  provide  intuition  about  how  the  system  is 
put  together. 

2.2  Programmer’s  scenario 

The  programmer’s  scenario  is  more  bottom-up. 
When  a  programmer  is  making  changes,  there  are  at 
least  three  concerns.  First,  the  written  code  must 
perform  the  right  calculation.  Second,  any  non-local 
effects  of  this  calculation  must  be  assessed.  For  ex¬ 
ample,  are  any  global  variables  being  affected  by  the 
change,  and  is  this  intended?  Third,  any  effects  on  this 
calculation  by  other  calculations  must  be  assessed.  For 
example,  are  the  actual  inputs  to  the  calculation  the 
ones  that  the  programmer  desires?  These  last  two  con¬ 
cerns  require  non-local  reasoning  about  the  program’s 
behavior; they can be thought of as forward and
backward slices [23][18] of the new code, respectively.
Using a syntax-directed tool, the programmer would
specify  a  backward  (or  forward)  search  from  the  pro¬ 
posed  new  component,  looking  for  references  to  vari¬ 
ables  that  the  new  component  reads  (or  writes).  For 
most  languages,  the  structure  of  the  program  provides 
fairly  accurate  control  flow  information,  and  the  sym¬ 
bol  table  provides  precise  scope  information,  ensuring 
a  reasonable  dependence  analysis.  But,  as  with  the  de¬ 
signer’s  scenario,  alias  information  has  to  be  handled 
carefully.  Note  that  on  large  projects  such  as  CHCS, 
there  are  dozens  of  programmers  who  need  to  perform 
such  calculations. 
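
As a rough illustration of such a backward or forward search, the sketch below works on straight-line code only. It is an assumption-laden toy, not the tool's algorithm: each statement is reduced to a hypothetical (writes, reads) pair of variable names, there are no branches or loops, and aliasing is ignored, which is exactly the information the paragraph above says must be handled carefully.

```python
# Each statement is a hypothetical (writes, reads) pair of variable names.
STMTS = [
    (("a",), ()),          # 0: a = 1
    (("e",), ("a",)),      # 1: e = a
    (("c",), ()),          # 2: c = 5
    (("a",), ("c",)),      # 3: a = c * 2
    (("b",), ()),          # 4: b = 7
    (("d",), ("a", "b")),  # 5: d = a + b   (the proposed new code)
]


def backward_slice(stmts, index):
    """Earlier statements whose results can reach statement `index`."""
    needed = set(stmts[index][1])        # variables the statement reads
    result = []
    for i in range(index - 1, -1, -1):
        writes, reads = stmts[i]
        if needed & set(writes):
            result.append(i)
            needed -= set(writes)        # nearest definition found
            needed |= set(reads)         # now need that statement's inputs
    return sorted(result)


def forward_slice(stmts, index):
    """Later statements that can be affected by statement `index`."""
    tainted = set(stmts[index][0])       # variables the statement writes
    result = []
    for i in range(index + 1, len(stmts)):
        writes, reads = stmts[i]
        if tainted & set(reads):
            result.append(i)
            tainted |= set(writes)       # the effect propagates
        else:
            tainted -= set(writes)       # overwritten by an unrelated value
    return result
```

Here backward_slice(STMTS, 5) selects statements 2, 3, and 4, while statement 0 is excluded because its value of a is overwritten before statement 5 reads it.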

In  either  of  these  scenarios,  the  next  step  might  be 
to  use  the  tool  to  perform  non-local  program  trans¬ 
formations  to  make  structural  changes  to  help  make 
enhancements. 

Analyzing these scenarios reveals that the frequency
of use or the number of iterations for each use
demands that a program understanding and transformation
tool have good performance. Otherwise, the
tool will not be used, likely leading to a more ad hoc
software  maintenance  process.  Furthermore,  a  pro¬ 
gram  understanding  and  transformation  tool  must  be 
able to perform quick “what-if” queries of the program
and have an extendable scripting language that
allows  the  tool  user  to  easily  write  and  extend  such 
queries  and  transformations.  Because  traditional  pro¬ 
grammers  are  expected  to  use  these  tools,  the  queries 
must  be  expressible  in  a  familiar  paradigm,  such  as 
an  imperative  programming  language.  Existing  tools 
meet  some,  but  not  all,  of  these  requirements. 

2.3  Existing  tool  designs 

TXL  is  a  program  transformation  tool  based  on  a 
tree  rewriting  computational  paradigm  [8].  Basically, 
the TXL rewriting engine applies a rewrite rule to the
parse tree of a program until no constructs remain in
the parse tree to which the rule applies.
This paradigm, however, complicates searches,
one-pass transformations, and other tasks outside the
rewrite paradigm. Although there are hooks to the underlying
language for more general tasks, the awkward
interface  discourages  frequent  use.  TXL’s  matching 
language  uses  concrete  syntax  for  specifying  pattern 
matching.  Although  appealing,  this  approach  compli¬ 
cates  access  to  a  part  of  a  program  that  is  in  a  deeply 
nested  construct,  because  all  of  the  surrounding  text 
to  the  desired  part  must  be  specified.  It  can  be  difficult 
to  create  an  abstract  interface  to  the  program  source 
to  overcome  this  problem.  Finally,  TXL’s  custom- 
built  parser-generator  accepts  ambiguous  grammars, 
resulting  in  slow  processing  of  the  program  source. 
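
The rewrite-until-no-match paradigm can be sketched generically as follows. This is not TXL syntax and not TXL's engine; the nested-tuple trees, the function names, and the example rule drop_add_zero are hypothetical, and the sketch only shows why a one-pass search fits this model awkwardly: the engine's only primitive is "find a match, replace it, start again."

```python
def rewrite_once(tree, rule):
    """Apply `rule` at the first matching node (pre-order).

    Returns (new_tree, changed). `rule` returns a replacement subtree,
    or None when it does not apply to the node it is given.
    """
    replacement = rule(tree)
    if replacement is not None:
        return replacement, True
    if isinstance(tree, tuple):
        for i, child in enumerate(tree):
            new_child, changed = rewrite_once(child, rule)
            if changed:
                return tree[:i] + (new_child,) + tree[i + 1:], True
    return tree, False


def rewrite_to_fixpoint(tree, rule):
    """Keep rewriting until the rule applies nowhere in the tree."""
    changed = True
    while changed:
        tree, changed = rewrite_once(tree, rule)
    return tree


def drop_add_zero(node):
    """Example rule: ("add", x, ("num", 0)) becomes x."""
    if (isinstance(node, tuple) and len(node) == 3
            and node[0] == "add" and node[2] == ("num", 0)):
        return node[1]
    return None
```

For instance, applying rewrite_to_fixpoint with drop_add_zero to ("mul", ("add", ("add", ("var", "x"), ("num", 0)), ("num", 0)), ("num", 2)) takes two rewriting steps and a final pass that finds no match.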

REFINE  is  a  more  general  matching  and  transfor¬ 
mation  tool  based  on  a  strongly-typed  first-order  set- 
theoretic  logic  [7].  Although  this  paradigm  is  elegant 
and  powerful,  it  requires  users  to  learn  a  new  computa¬ 
tional  paradigm,  and  complicates  performing  higher- 
order  operations.  According  to  estimates  provided 
by REFINE’s implementors, entering CHCS into REFINE
would require about 8 hours.² Although a subsystem
has to be reentered only when it is changed,
some subsystems in CHCS would require 9 minutes
to reenter. In order to ensure accurate global analyses,
several files might have to be reentered several
times a day as they evolve. An AST may be about
20-40 times larger than the program text, depending
on what attributes are specified for the AST nodes.

To provide its many flexible features, REFINE is
implemented in the REFINE language, which itself is
implemented in Common Lisp [22]. Although the resulting
tool is compiled, Common Lisp is still slower
and uses more space for data compared to lower-level
compiled languages such as C. Like TXL, REFINE’s
scripting language uses the concrete syntax of the language
for specifying pattern matching. Although REFINE
allows using abstract syntax, some of the special
features associated with concrete syntax matching
are then not available. Like TXL, REFINE has
its own parser-generator based on Extended Backus-Naur
Form (EBNF), although it handles a more standard
subset of the context-free languages.

² Lawrence Markosian, Personal Communication. This estimate
is based on REFINE applications for FORTRAN and
COBOL.

REFINE  allows  storing  of  the  ASTs  and  other  com¬ 
puted  information  as  objects  in  persistent  storage  for 
later  retrieval.  This  can  be  useful  for  managing  very 
large  structures  when  the  swap  space  of  the  machine 
is  inadequate  to  store  them.  It  also  allows  saving  data 
that is costly to recompute (such as data flow information)
between uses of the tool. Retrievals from the
database are performed on a per-file basis, and retrieving
files is estimated to be 5 times faster than entering
files.³ Space may be conserved by a specification
of  what  subset  of  AST  attributes  to  preserve.  Tuning 
the  storage  size,  however,  can  be  a  complicated  task. 

Many  of  REFINE’s  problems  are  accidental  in  na¬ 
ture,  and  a  few  essential  [6].  For  example,  it  imple¬ 
ments  the  core  of  its  paradigm,  sets,  with  linked-lists, 
although  with  some  additional  effort  they  could  be 
implemented  more  efficiently  as  hash  tables.  On  the 
other  hand,  the  use  of  the  very  high-level  REFINE 
language  for  the  REFINE  tool  and  its  instantiation  to 
operate  on  a  particular  language  poses  inherent  risks 
because  of  the  complexity  of  compiling  a  high-level 
language  to  an  efficient  form.  However,  REFINE’s 
implementors are now working on retargeting the REFINE
language to generate code for a low-level language
rather than Common Lisp. They are also beginning
to use techniques for improving the selection
of  representations  for  high-level  data  types  [4]. 

2.4  Design  choices  for  a  new  tool 

Achieving  both  flexibility  and  good  performance, 
regardless  of  the  chosen  software  architecture,  is  dif¬ 
ficult.  Flexibility  typically  demands  the  use  of  elab¬ 
orate  constructs  that  can  be  inefficient.  Our  primary 
challenge,  then,  was  determining  exactly  where  flex¬ 
ibility  or  performance  could  be  sacrificed  in  favor  of 
the  other,  in  order  to  achieve  both  in  typical  uses  of 
the  tool.  This  goal  can  be  achieved  by  understanding 
what the scenarios above require, and by using existing
compiler technology. The theory and practice of
compilers have yielded tools, insights, and heuristics
that can significantly simplify the design and implementation
of a program understanding and transformation
tool, because such a tool is quite similar to a
compiler. The following describes the impact of using
a compilers paradigm for the design of our tool.

³ Philip Newcomb, Personal Communication.

Parsing:  We  found  that  lexing  and  parsing  do  not 
need  to  be  especially  flexible,  but  they  have  to  be  ef¬ 
ficient  since  processing  every  character  in  a  program 
is  I/O  and  compute  intensive.  Nearly  all  program¬ 
ming  languages  in  use  can  be  lexed  and  parsed  by 
existing,  fast  parsing  tools  such  as  lex  and  yacc,  and 
any language constructs that are difficult to process
can be handled by incorporating extra semantic actions
triggered by these tools. The declarative input
to these tools and programmers’ familiarity with them
make them easier to use than other parsing techniques.
Also,  many  lexer  and  parser  descriptions  are  publicly 
available  for  commonly  used  languages  such  as  C  and 
Ada. 

ASTs:  Similarly,  we  found  that  flexibility  is  not  a 
key  concern  in  implementing  ASTs.  ASTs  have  a  reg¬ 
ular  tree  structure  that  is  analogous  to  the  language 
grammar.  This  relationship  guides  the  design  of  the 
AST,  and  the  actions  of  the  grammar’s  parsing  rules 
can  straightforwardly  construct  an  AST  for  a  program. 
To make the programming of AST construction actions
even easier, we have implemented a preprocessor
that extends the syntax of yacc using additional
parsing  directives.  The  new  directives  allow  for  creat¬ 
ing  new  subtrees  directly,  or  indirectly  by  invoking  a 
user-defined  action.  The  preprocessor  can  also  gener¬ 
ate  a  header  file  of  function  prototypes  and  templates 
for  the  user-defined  actions  and  for  routines  that  can 
print  the  AST. 

To  avoid  the  need  for  special  techniques  for  storing 
ASTs  off-line,  the  space  consumed  by  an  AST  needs 
to  be  minimized.  An  AST  representation  may  be  10 
times  larger  than  a  textual  representation,  even  when 
care  is  exercised  in  its  design.  (Significant  space  can 
be  saved  by  optimizing  the  representation  of  the  leaves 
of  the  AST,  but  this  can  complicate  accesses  and  up¬ 
dates  to  it.)  Consequently,  we  were  led  to  design  ASTs 
in  C,  which  ensures  compact  representation  of  struc¬ 
tures.  Also,  using  C  for  constructing  ASTs  integrates 
well with lex and yacc. (We had considered using an
object-oriented language, such as C++, but rejected
it because of persisting problems with the technology,
such as space and execution overhead. These problems
can be overcome with sufficient effort, but this negates
the benefits of object-orientation.) To minimize the
storage  required  for  an  AST  node,  non-intrinsic  infor¬ 
mation  is  typically  not  stored  in  the  node.  Instead, 
the  scripting  language  provides  a  relation  data  type 
for  mapping  from  AST  nodes  to  non-intrinsic  infor¬ 
mation. 
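
The idea of keeping non-intrinsic information out of the node can be sketched as a side table. The Python below is only an analogy for the relation data type mentioned above: the class names Node and Relation and their methods are hypothetical, and the real tool stores its nodes as C structures rather than Python objects.

```python
class Node:
    """AST node holding only intrinsic fields; __slots__ avoids a per-node dict."""
    __slots__ = ("kind", "text", "children")

    def __init__(self, kind, text="", children=()):
        self.kind = kind
        self.text = text
        self.children = tuple(children)


class Relation:
    """Side table mapping AST nodes to non-intrinsic values, by node identity."""

    def __init__(self):
        self._pairs = {}

    def put(self, node, value):
        # Keep a reference to the node so its id() stays unique while stored.
        self._pairs[id(node)] = (node, value)

    def get(self, node, default=None):
        entry = self._pairs.get(id(node))
        return entry[1] if entry is not None else default

    def __len__(self):
        return len(self._pairs)
```

A script would create one Relation per kind of derived fact (for example, "is this reference a global?") and discard it when the query finishes, so nodes that never receive an annotation pay no storage cost for it.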

To  further  decrease  memory  requirements  and  im¬ 
prove  paging  behavior,  for  one-pass  scripts  we  incor¬ 
porated  the  ability  to  process  ASTs  as  they  are  con¬ 
structed,  rather  than  waiting  for  a  program’s  entire 
AST  to  be  constructed.  This  interleaving  allows  dis¬ 
carding  an  AST  fragment  as  soon  as  it  is  processed. 
Consequently,  memory  requirements  are  greatly  re¬ 
duced,  and  the  accompanying  paging  time,  which  can 
be  substantial,  is  eliminated.  In  some  cases  the  actual 
AST  computation  time  is  also  reduced.  Currently,  this 
feature  requires  the  script  programmer  to  specify  the 
granularity  of  processing.  In  practice  this  has  not  been 
an  onerous  responsibility. 
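
The interleaving of parsing and processing can be illustrated with a generator that hands over each AST fragment as soon as it is complete. This is a sketch under stated assumptions, not the tool's implementation: the "routine NAME ... end" input format is invented for the example (it is not MUMPS), the names parse_units and count_commands are hypothetical, and the granularity of processing is fixed here at one routine.

```python
def parse_units(lines):
    """Yield one small AST fragment per routine as soon as it is complete."""
    name, body = None, []
    for line in lines:
        line = line.strip()
        if line.startswith("routine "):
            name, body = line.split()[1], []
        elif line == "end":
            yield ("routine", name, tuple(body))
            name, body = None, []      # the finished fragment is released
        elif line:
            body.append(("cmd", line))


def count_commands(lines, wanted):
    """One-pass script: count commands named `wanted` in each routine."""
    counts = {}
    for _, name, body in parse_units(lines):
        counts[name] = sum(1 for _, text in body if text.split()[0] == wanted)
    return counts


SOURCE = [
    "routine A", "set x", "xecute y", "end",
    "routine B", "xecute z", "xecute w", "end",
]
```

Because count_commands holds only one fragment at a time, peak memory depends on the largest routine rather than on the size of the whole program, which is the effect the paragraph above describes.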

Matching  and  transformation  language:  The 
main  challenge  in  providing  both  flexibility  and  perfor¬ 
mance  is  designing  the  language  constructs  provided 
to  the  tool  user  for  manipulating  the  AST.  The  tool 
user  does  not  want  to  be  distracted  by  memory  man¬ 
agement,  search  strategies,  typing,  or  complex  syntax. 
Typically,  eliminating  such  distractions  via  a  high- 
level  computational  paradigm  comes  at  the  expense 
of  performance  or  generality  of  the  language.  Such  a 
paradigm  also  may  be  difficult  for  a  traditional  pro¬ 
grammer  to  use,  indicating  that  an  imperative  pro¬ 
gramming  paradigm  might  be  appropriate.  In  fact,  we 
have  found  that  many  scripting  tasks,  such  as  coding 
standards  checking  (i.e.,  syntactic  and  static  semantic 
checking)  and  flow  analysis,  are  well-known  compiler 
problems  and  have  been  shown  to  be  relatively  easy  to 
code  efficiently  in  a  traditional  imperative  language. 
Also,  a  high-level  paradigm  compromises  the  ability 
to  use  existing  C  code  for  graphics  and  other  complex 
tasks. Although calling from a high-level to a low-level
language is not always difficult, storing high-level
types  in  the  low-level  language  presents  problems,  as 
does  performing  callbacks  to  the  high-level  language. 
Callbacks are a common technique in systems like the
X  window  system  [17]. 

We  believe  that  the  advantages  of  scripting  lan¬ 
guages  are  usually  limited  to  small  programs,  since 
they emphasize simplicity, making large programs difficult
to write. For large programs, typing, modular
structure,  and  good  performance  become  more  impor¬ 
tant,  but  supporting  such  needs  complicates  writing 
smaller programs. Thus, simple programs are more
easily written in the scripting language, but more complex
programs are more easily written in C or C++
and  called  from  a  script.  This  approach  also  allows 
using  the  growing  number  of  existing  C  code  libraries. 

Our  approach,  then,  which  is  still  under  develop¬ 
ment,  is  to  create  a  simple  imperative  scripting  lan¬ 
guage  that  can  be  translated  directly  to  C.  Only  a  few 
high-level facilities are provided, such as a simplified
syntax,  goal-directed  searching,  tables,  sets,  bit-sets 
(e.g.,  for  flow  analysis),  AST  printing,  and  memory 
management.  By  keeping  the  number  of  high-level 
features  small,  a  simple  translator  can  produce  effi¬ 
cient  code.  Types  of  variables  need  not  be  declared  by 
the  programmer,  but  the  translator  infers  type  decla¬ 
rations  for  a  variable  from  the  type  of  the  initializing 
value  of  the  first  of  its  definitions.  The  types  of  built-in 
functions  and  linked-in  C  functions,  which  are  known, 
also  contribute  to  the  inference  process.  Memory  man¬ 
agement  is  provided  by  a  conservative  collector  [5],  al¬ 
though  the  AST  is  treated  specially.  To  provide  an 
interactive  environment  for  using  our  tool,  the  shared 
library  features  of  many  UNIX  platforms  can  be  used 
to  implement  dynamic  linking  and  execution  of  scripts 
written  on  the  fly. 
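The inference rule described above (a variable takes the type of the initializing value of its first definition, with known function types contributing) might look roughly like this. The sketch is illustrative only: the representation of definitions, the function `infer_c_types`, and the linked-in function `ast_root` are all hypothetical.

```python
def infer_c_types(definitions, known_functions=None):
    """Map each variable to a C type taken from its first definition.

    definitions: (variable, initializer) pairs in program order; an
    initializer is a Python literal or ("call", function_name).
    """
    known_functions = known_functions or {}
    literal_types = {int: "int", float: "double", str: "char *"}
    types = {}
    for variable, initializer in definitions:
        if variable in types:
            continue  # only the first definition decides the type
        if isinstance(initializer, tuple) and initializer[0] == "call":
            types[variable] = known_functions[initializer[1]]
        else:
            types[variable] = literal_types[type(initializer)]
    return types


inferred = infer_c_types(
    [("count", 0), ("name", "main"), ("node", ("call", "ast_root")), ("count", 2.5)],
    known_functions={"ast_root": "struct ast_node *"},
)
```

Note that the later definition of `count` with a floating-point value does not change its inferred type.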

2.5  Tool  architecture 

Because  our  tool  consists  of  many  components  of 
various  levels  of  functionality,  we  chose  a  layered  de¬ 
sign  for  our  tool.  Layering  is  useful,  since  successively 
higher  layers  can  incrementally  provide  more  expres¬ 
sive  functionality.  Since  a  higher  layer  cannot  be  more 
efficient  than  its  underlying  layers,  the  lower  layers 
must  be  especially  efficient.  The  challenge  is  to  pro¬ 
vide  successively  more  expressive  layers  without  com¬ 
promising  performance. 

The  following  are  the  layers  of  an  instance  of  our 
tool  for  a  specific  language,  listed  from  the  lowest  to 
the  highest  layer.  Note  that  the  lower  layers  are  not 
always  strictly  layers  in  that  each  also  has  the  full 
power  of  the  implementation  language  available  to  it. 
They  are  layers  in  the  sense  of  the  new  service  provided 
by  that  layer. 

1. AST construction: The lowest layer of the tool
is  the  AST  construction  layer.  It  consists  of  defi¬ 
nitions  for  the  basic  AST  node  types,  constructor 
functions,  and  functions  to  link  subtrees.  Symbol 
table  construction  is  also  performed  in  this  layer. 

2.  Lexer  and  parser:  The  lexer  and  parser  read  in 
the source files and call the AST construction routines
to assemble an AST and fill the symbol table.
Note that the layers above determine how the
parser  is  invoked.  For  example,  the  parser  may 
be  invoked  to  assemble  an  AST  one  file,  routine 
or  statement  at  a  time.  The  lexer  and  parser  are 
different  for  each  programming  language,  and  are 
generated  from  declarative  lexical  and  syntactic 
descriptions  using  lex  and  yacc  (See  Section  2.6). 

3.  Abstract  syntax  interface:  The  abstract  inter¬ 
face  to  the  AST  consists  of  basic  access,  update, 
and searching functions for the AST. It also allows
a request to build a new AST, to add one to
the existing tree, or to control the granularity of
AST construction for one-pass algorithms (Section 2.4).

4.  Matching  and  transformation  language: 
This  layer  provides  high-level  linguistic  mecha¬ 
nisms  for  examining  and  transforming  the  AST. 
These  mechanisms  include  the  abstract  inter¬ 
face  to  the  AST,  memory  management,  dynamic 
typing,  and  goal-directed  pattern  matching,  in 
addition  to  more  standard  language  constructs. 
Straightforward  translation  to  C  supports  access¬ 
ing  existing  code  such  as  graphics  libraries. 

5. User script: The highest layer of the tool is the
tool  user’s  script,  written  in  the  matching  and 
transformation  language.  A  script  implements 
a  specific  program  understanding  or  transforma¬ 
tion  task  such  as  determining  set-use  information 
or  performing  dead-code  elimination.  A  sophis¬ 
ticated  user  can  write  his  or  her  own  script,  but 
a  novice  user  may  perform  only  simple  actions  or 
borrow  scripts  from  others. 

Note  that  the  AST  module  spans  two  non-adjacent 
layers  (1  and  3).  This  division  permits  controlling  the 
granularity  of  AST  construction  without  introducing 
circularities  between  the  layers  or  forcing  the  AST  into 
two  different  modules  [11][20]. 

2.6  Generic  infrastructure 

In  order  to  amortize  the  costs  of  building  an  in¬ 
stance  of  this  tool  for  a  specific  language,  it  is  advan¬ 
tageous  to  develop  a  generic  tool  infrastructure — like 
that  provided  by  TXL  and  REFINE — that  can  be  eas¬ 
ily  adapted  to  a  specific  language.  Thus  there  are  re¬ 
ally  two  tools,  the  generic  framework  for  creating  a 
tool  for  a  specific  language,  and  the  tool  for  a  specific 
language. 

The  generic  tool  includes  the  scripting  language, 
the  UNIX  tools  lex  and  yacc,  the  yacc-preprocessor 
that provides concise parsing directives for AST construction,
and the symbol table and AST datatypes.
The  tool  retargeter  is  responsible  for  providing  the 
standard  lexical  and  syntactic  descriptions  to  lex  and 
yacc,  along  with  the  appropriate  AST  node  and  sym¬ 
bol  table  entry  definitions,  AST  linking  directives  and 
symbol  table  actions  for  building  an  AST  for  a  pro¬ 
gram  in  the  target  language.  Finally,  the  retargeter 
must  design  an  appropriate  abstract  interface  to  the 
AST,  which  becomes  the  set  of  built-in  procedures  of 
the  matching  and  transformation  language.  This  in¬ 
terface  needs  to  be  carefully  chosen  to  faithfully  con¬ 
vey  the  language  syntax  and  to  allow  easy  access  to 
nested  constructs. 

3  Discussion 
3.1  Design  history 

Our  early  efforts  in  designing  this  tool  focused  on 
discovering  the  tradeoffs  between  performance  and 
flexibility.  Our  first  attempt  at  implementing  a  parser 
and  ASTs  used  Icon  [9].  We  chose  Icon  because  of 
its  support  for  goal-directed  pattern  matching,  flexible 
aggregate  data  structures,  dynamic  typing,  garbage 
collection,  rapid  edit-compile-execute  cycle,  and  opti¬ 
mizing  compiler.  Additionally,  we  believed  it  could  be 
more  efficient  and  expressive  than  Common  Lisp  for 
the  tasks  we  wanted  to  perform.  However,  we  found 
that  yacc-like  bottom-up  parsing  in  Icon  was  too  slow 
and  the  resulting  code  was  difficult  to  compile.  Like¬ 
wise,  although  the  ASTs  were  reasonably  compact, 
they  were  still  twice  as  large  as  ASTs  implemented 
in  C.  Overall,  Icon  was  about  5-10  times  slower  than 
what  we  finally  achieved  in  C. 

After  this  failed  attempt,  we  realized  that  perfor¬ 
mance  outweighed  most  of  our  concerns  about  flexi¬ 
bility.  Much  of  the  flexibility  we  desired  is  available 
by  using  lex  and  yacc,  whose  declarative  syntax  eases 
making  small  changes  during  development.  Likewise, 
we  found  that  the  analogous  structures  of  the  AST 
allowed  implementing  ASTs  in  a  relatively  declarative 
fashion. 

Once  we  implemented  the  basic  tool,  we  sought  ad¬ 
ditional  techniques  to  speed  processing.  For  example, 
some  queries  are  simple  enough  that  they  can  be  per¬ 
formed  during  parsing  without  an  AST.  Hence,  we 
wanted  to  provide  a  mechanism  for  specifying  that  a 
query  should  occur  as  part  of  a  parsing  action.  How¬ 
ever,  we  found  the  approach  impractical.  The  rules 
that would have to be added to the grammar to accommodate
the on-the-fly processing either complicated
the grammar unacceptably or introduced parsing conflicts
that could not be resolved satisfactorily by yacc’s
LALR(1) parsing strategy. Consequently, we settled
on the approach of interleaving processing with
AST  construction  (Section  2.4). 

The  problem  of  inserting  processing  actions  into  the 
grammar  is  an  example  of  a  broader  class  of  issues  cre¬ 
ated  by  using  standard,  highly  optimized  tools  such 
as lex and yacc. For one, these tools were not created
with  tasks  such  as  AST  printing  and  AST  interface 
design  in  mind.  For  example,  both  lexical  and  syn¬ 
tactic  information  is  needed  to  create  an  AST  printer. 
However,  because  lex  and  yacc  are  separate  tools,  and 
some  of  their  processing  is  handled  by  C  code  rather 
than  declarations,  insufficient  information  is  available 
to  successfully  automate  the  generation  of  an  AST 
printer.  Also,  their  underlying  algorithms  and  input 
syntax  were  designed  primarily  for  performance  at  the 
expense  of  other  needs.  For  instance,  a  yacc  grammar 
is  best  written  in  a  left-recursive  fashion  to  minimize 
stack  depth.  Unfortunately,  such  an  encoding  of  a 
grammar  does  not  necessarily  yield  a  grammar  that 
is  suitable  for  automatic  derivation  of  abstract  inter¬ 
faces. 

Our  experience  so  far  indicates  that  the  time  re¬ 
quired  to  program  components  such  as  AST  printing 
and  the  AST  interface  is  acceptable,  especially  since 
they  are  programmed  just  once  for  each  retarget  of 
the  tool  to  a  new  language.  On  the  other  hand,  once 
the  tool  is  retargeted,  it  will  be  used  repeatedly  in 
contexts  where  fast  performance  is  paramount. 

3.2  Performance  results 

On  a  Sun  Sparc  10  model  41,  with  96MB  of  RAM 
and  500MB  of  swap  space,  the  resulting  prototype 
parses  MUMPS  code  at  250,000  lines  per  minute.  It 
can  parse  code  and  produce  an  AST  at  67,000  lines 
per  minute.  If  the  AST  is  discarded  as  it  is  traversed 
(See Section 2.4), our tool can proceed at 110,000 lines
per  minute  due  to  reduced  paging  overhead.  An  AST 
node  requires  24  bytes  (for  application  data,  pointers 
to  parent,  children,  and  siblings).  Our  tool  can  con¬ 
struct  an  AST  for  CHCS — consisting  of  600,000  lines 
of  code — in  about  9  minutes  using  250MB  of  storage. 
REFINE,  with  its  current  technology,  parses  code  and 
builds  ASTs  at  about  1200  lines  per  minute.  With  es¬ 
timates  based  on  a  Sparc  10  with  500MB  of  RAM,  it 
could  construct  a  full  AST  for  CHCS  in  about  8  hours 
using  about  600MB  of  storage. 

A  simple  set-use  computation  on  CHCS,  which  can 
be  performed  at  the  granularity  of  a  MUMPS  com¬ 
mand,  takes  10  minutes  (about  6  minutes  for  parsing 
and 4 minutes for the set-use computation). At this
granularity,  the  AST  never  exceeds  4KB  of  storage. 
In  a  scenario  in  which  there  are  many  programmers 
on  a  project,  this  permits  several  programmers  on  the 
same  machine  to  run  one-pass  scripts  simultaneously. 
Although  the  time  to  parse  and  reconstruct  the  ASTs 
dominates  for  simple  operations,  it  is  small  enough  to 
permit  reconstructing  the  AST  for  each  script.  When 
several  script  runs  are  anticipated,  it  is  possible  that 
building  the  whole  AST  and  then  running  scripts  in¬ 
teractively  inside  our  tool  might  provide  better  per¬ 
formance.  For  a  scenario  with  several  programmers,  a 
client-server  approach  to  allow  sharing  the  AST  might 
be  best.  However,  it  is  difficult  to  assess  how  effective 
these  techniques  will  be  in  the  typical  case  because 
of  the  tradeoff  between  reconstructing  the  AST  and 
swapping  due  to  the  increased  memory  requirements. 
The  amount  of  RAM  available  and  the  anticipated  size 
of  the  data  structures  are  key. 

The performance of our tool is adequate for performing
“what-if” queries and repeatedly refining
queries  to  converge  on  a  desired  solution.  For  in¬ 
stance,  the  set-use  computation  was  refined  from  a 
naive  script  into  a  usable  form  over  several  iterations. 
If  the  script  were  tested  on  the  whole  system  (which 
is  necessary  in  some  cases  to  reveal  certain  omissions 
and  performance  problems),  the  script  could  be  tested 
6  times  per  hour  (not  including  coding  time). 
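The timing figures in this section can be cross-checked with simple arithmetic. The sketch below uses only the throughput numbers quoted above; the variable names are illustrative, and the observation that the 6-minute parsing time matches the 110,000 lines-per-minute rate is an inference rather than a statement from the text.

```python
CHCS_LINES = 600_000


def minutes(lines, lines_per_minute):
    """Time in minutes to process 'lines' at a given throughput."""
    return lines / lines_per_minute


full_ast_min = minutes(CHCS_LINES, 67_000)      # parse and keep the whole AST
interleaved_min = minutes(CHCS_LINES, 110_000)  # discard the AST as it is traversed
refine_hours = minutes(CHCS_LINES, 1_200) / 60  # REFINE estimate
runs_per_hour = 60 / 10                         # one 10-minute set-use run
```

These values come out at roughly 9 minutes, about 5.5 minutes, about 8.3 hours, and 6 runs per hour, in line with the reported figures.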

4  Conclusion 

Automating  semantically-precise  techniques  for 
manipulating  programs  can  help  maintain  large  sys¬ 
tems,  but  they  can  be  too  costly  to  implement  or 
use.  An  alternative  is  to  use  a  simpler,  faster,  syntax- 
directed  tool  that  captures  much  of  the  needed  infor¬ 
mation.  However,  missing  information  must  be  pro¬ 
vided  by  the  tool  user,  perhaps  by  repeatedly  running 
the  tool  with  a  variety  of  related  queries.  Designing 
such  a  tool  with  enough  performance  and  flexibility 
to  offset  its  semantic  shortcomings  is  difficult.  How¬ 
ever,  by  taking  advantage  of  the  theory,  experience, 
and  tools  of  traditional  compiler  and  language  design, 
this  goal  can  be  met. 

We  have  implemented  the  lower  layers  of  our  tool 
but  are  still  designing  the  programming  layer.  Once 
we  have  completed  our  tool,  then  we  can  verify  our 
hypothesis  about  how  programmers  and  designers  can 
use  domain  and  design  information  to  permit  using 
syntactic analysis for cheaply acquiring and using semantic
information. The tool will be evaluated on
CHCS. However, our early results
suggest that using traditional compiler techniques in
an  established  technology  base  can  provide  substantial 
performance  benefits  and  ease  of  use. 

ABSTRACT: The SF (Set-Function) specification and
prototyping language has numerous properties that
make it an attractive intermediate language for the reengineering
of embedded control systems. An existing
control system is mapped to SF, changes are made to
the SF prototype, and the new prototype is then
transformed into an implementation in the language of
the user’s choice. We show how a fairly major redesign
task to an elevator controller was very rapidly introduced
into the SF prototype.

1.  Introduction 

In  general,  software  controls  a  process,  provides  access 
to  information,  or  changes  the  form  of  data,  and  we 
speak of a control system, an information system, or a
data  transformer.  Examples  of  the  three  kinds:  an 
elevator  controller,  an  inventory  information  system, 
and  a  program  that  transforms  cartesian  coordinates  into 
polar  coordinates.  Most  systems  are  composite 
software-hardware  hybrids,  which  may  be  supervised 
by  a  human  operator.  For  example,  a  milling  machine 
transforms  a  piece  of  metal,  but  is  controlled  by 
software  selected  by  an  operator,  the  software  in  turn 
refers to tables in an information base, and this information
has to be put into an appropriate form by a data
transformer. For simplicity, here we shall restrict ourselves
to systems that consist of a software controller
embedded in a hardware device to be controlled. We
shall refer to the two components of the system as controller
and host.

Even  when  discussion  is  limited  to  controller-host 
systems,  numerous  types  of  change  that  necessitate 
reengineering  have  to  be  considered.  We  list  three 
broad  categories  of  such  changes. 

•  Shift  of  the  controller-host  boundary.  The 
boundary  is  shifted  to  alter  the  balance  between  system 
reliability and system performance. If more functions
of the combined system are taken over by the host,
functional  reliability  can  be  improved  because  duplica¬ 
tion  of  hardware  components  allows  hardware  reliabil¬ 
ity  to  be  increased  to  an  arbitrarily  high  level.  On  the 
other hand, if performance needs to be improved, the
boundary  has  to  be  shifted  in  the  other  direction. 

• Controller enhancement. An existing system is
enhanced  when  more  of  the  tasks  that  were  originally 
performed  by  human  operators  or  were  not  performed  at 
all  are  taken  over  by  the  controller. 

•  Host  modification.  An  existing  host  is 
modified,  necessitating  a  change  in  the  controller.  This 
is a common problem facing developers of controllers
under  concurrent  engineering,  but  often  modifications 
continue  to  take  place  after  release  of  the  system.  Host 
replacement  by  a  new  model  is  an  extreme  case  of 
modification. 

The  reengineering  process  can  be  applied  directly 
to  existing  code,  e.g.,  a  controller  written  in  a  mix  of 
Fortran  and  assembler  language  may  be  adapted  to  new 
conditions using these same languages, but we cannot
recommend such an approach. There are several reasons
why it is preferable to derive a specification from
the  existing  code,  make  the  required  changes  in  this 
specification,  and  then  translate  the  specification  into 
executable  code.  First,  the  existing  software  has  to  be 
thoroughly understood before a change is made. Program
understanding has been estimated to account for
40%  of  maintenance  costs  [1].  Although  there  exist 
tools  to  aid  in  program  understanding  [2],  and  these 
tools  could  in  principle  be  used  to  help  a  software  reen¬ 
gineering  team  to  make  changes  in  the  existing  software 
directly,  it  has  to  be  established  that  the  altered  code 
corresponds to sponsor expectations. This is more easily
done in specifications than in code. Second, when the
change also entails a language switch, e.g., from Jovial
to Ada, the need for thorough validation by the sponsor
assumes even greater importance. Third, once a
specification has been derived, future modifications will
be  much  easier  to  carry  out  and  validate  in  the 
specification  than  in  code.  Hence  the  writing  of 
specifications  can  be  regarded  as  preventive  mainte¬ 
nance.  This  presumes,  though,  that  the  specification 
language is appropriate for the task.

2.  Properties  of  specification  languages 

In what follows we shall be concerned with formal
specifications  of  software  systems,  i.e.,  specifications 
written  in  a  language  with  a  well-defined  syntax  and 
semantics.  A  formal  specification  is  a  bridge  between 
system  sponsors  and  the  people  who  develop,  maintain, 
and reengineer the system. A formal specification of a
controller must be an unambiguous statement of the
expectations of the sponsors, but expressed in such a
way that they can make sure that their expectations are
in fact being met. It follows that the mode of presentation
of a formal specification has to satisfy four criteria:
requirements must be expressed unambiguously, they
must be verifiable, they must be understandable by
sponsor  representatives  with  little  effort,  and  their  form 
must  allow  a  sponsor  representative  to  indicate  clearly 
any  changes  that  are  to  be  made  to  the  specifications. 
Simply  put,  specifications  must  be  at  the  same  time  pre¬ 
cise  and  readable,  and  this  is  difficult  to  achieve.  We 
shall  nevertheless  attempt  to  achieve  both  goals  with 
SF, to be described further down. This language will be
found to satisfy the following ten properties of
specification languages. The properties are consistent
with  the  principles  of  specification  and  prototyping 
languages  listed  in  [3-6]. 

1.  The  language  is  to  be  totally  unambiguous  so 
that  system  developers  and  maintainers  have  a  precise 
understanding  of  the  system.  This  means  that  the 
language  must  have  sound  theoretical  foundations. 

2.  The  language  is  to  allow  a  reading  knowledge 
to  be  acquired  with  little  training.  There  is  to  be  an 
economy  of  concepts. 

3.  The  language  is  to  allow  the  specification  of  a 
conceptual  model  of  a  controller  without  having  to 
define a detailed design or implementation model.
However,  it  has  been  found  that  some  intertwining  of 
specification,  design,  and  implementation  of  software  is 
inevitable  [7-9]. 

4.  The  language  is  to  encourage  development  of  a 
specification  in  the  form  of  modules.  Of  the  ten  most 
serious sources of risk in the software development process,
seven can be managed by incremental modular
software development [10]. Modularization based on
data types is a natural choice.

5.  The  language  is  to  require  that  the  specification 
of  a  module  consists  of  distinct  parts  that  deal  with  the 
structural  aspects  of  its  information  base,  its  behavioral 
aspects,  and  the  actual  control  process.  This  contributes 
to  the  clarity  of  a  specification  and  allows  the 
specification  of  a  module  to  be  developed  by  several 
members  of  a  reengineering  team  working  in  parallel. 

6.  The  language  is  to  enforce  the  principles  of 
data  abstraction  -  a  data  type  is  to  consist  of  a  set  and  of 
functions  defined  on  this  set. 

7.  The  language  is  to  allow  easy  representation  of 
integrity  constraints  on  the  information  base  of  a 
module. 

8.  The  language  is  to  provide  mechanisms  for 
linking individual processing tasks into control
processes,  and  is  to  allow  easy  definition  of  interactions 
of  a  controller  with  its  host,  including  real-time  interac¬ 
tions. 

9.  The  language  is  to  be  supportive  of  an  evolu¬ 
tionary  development  of  specifications,  i.e.,  it  should 
allow  easy  modification  and  augmentation  of  an  initially 
incomplete  specification.  Under  reengineering  this 
means  that  the  specification  can  be  derived  from  exist¬ 
ing  controller  software  in  incremental  steps. 

10. The language is to support reuse. For example,
it should not require much effort to convert the
specification  of  an  existing  controller  into  a  controller 
for  a  similar  application. 

3.  An  introduction  to  SF 

The  specification  and  prototyping  language  SF  (short 
for  Set-Function)  has  been  developed  over  the  past  eight 
years. The language was introduced in 1986 [5]; a
later version (somewhat simplified since then) is
described  in  [11];  examples  of  SF  specifications  can  be 
found  in  [12].  A  specification  in  the  SF  (Set-Function) 
language  is  composed  of  segments,  which  correspond  to 
data  types.  A  segment  has  three  parts:  a  schema  that 
describes  an  information  base,  events  that  change  the 
information  base,  and  a  control  component,  which  con¬ 
sists  of  transactions  that  allow  events  to  be  joined 
together  into  a  process.  The  schema  definition  identifies 
the  set  of  interest  for  the  segment  and  defines  functions 
that map from this set. A control process is built up by
means  of  signals  that  are  either  "on"  or  "off",  and  that 
link  events  and  transactions  within  and  across  segments. 
Events  switch  signals  on;  transactions  switch  them  off. 
A  transaction  may  be  implemented  in  software  or 
hardware,  thus  providing  a  high  degree  of  flexibility.  A 
signal may be a carrier of additional information that is
to  be  sent  from  one  segment  to  another,  or  merely  from 
one  event  to  another.  Hence  switching  on  a  signal  is  in 
effect  the  sending  of  a  message. 

We  shall  use  a  library  system  to  provide  us  with 
an  example  of  a  schema.  We  do  so  because  schemas 
for  realistic  controller-host  systems  are  much  too  large 
for  an  introductory  exposition  (a  schema  for  an  elevator 
controller can be found in [12]). Moreover, we shall at
times deviate from the rules of SF to avoid getting distracted
by technical detail of little relevance in such an
exposition. There can be several copies of the same
title in the library, so there is a need for a copies and a
titles segment. There is also a borrowers segment.
Only the schema for copies will be considered here.

SEGMENT Copies;
IMPORTEDSIGNAL TitleFixed(Copy, Title);
SIGNALS
    CatalogCopy(Copy, Title);
    AddTitle(Copy, Title);
    AdjustCount(Borrower, Integer);
TYPE Copy;
    SET C (SUBSETS: AV, CO, BR, L, R);
    FUNCTIONS
        Bookid: C → Title;
        History: C → (Borrower × Date)-set;
ENDTYPE;

The  set  for  data  type  Copy  is  C,  and  it  is  parti¬ 
tioned  into  subsets  AV,  CO,  BR,  L,  and  R.  A  copy  is  in 
the  corresponding  subset  if  it  is  available  for  borrowing, 
checked out, being repaired, lost, or removed, respectively.
For each copy, function Bookid tells to what title
this copy belongs, and History gives its complete borrowing
history. Three signals originate in this segment
and one is imported into it (from segment Titles). Signal
AddTitle is raised when the first copy of a title
arrives in the library, and there is no information about
this  title  in  the  catalog.  The  Titles  segment  picks  up  the 
signal,  updates  the  catalog,  and  sends  back  signal  Title- 
Fixed,  which  initiates  the  introduction  of  the  copy  infor¬ 
mation  into  the  catalog.  Signal  CatalogCopy  has  the 
same  effect  as  TitleFixed,  but  is  raised  if  this  is  not  a 
first  copy,  i.e.,  it  is  raised  in  the  Copies  segment  itself  if 
the  title  information  is  already  in  the  catalog.  The 
remaining  signal  causes  adjustment  of  the  count  of 
copies  held  by  a  borrower. 
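To make the schema concrete, the set C, its partition into subsets, and the two functions can be modeled with ordinary dictionaries and sets. This is only an illustrative Python analogue, not SF itself; the class `Copies`, its methods, and the sample data are hypothetical.

```python
SUBSETS = ("AV", "CO", "BR", "L", "R")


class Copies:
    """Set C of copies, partitioned into subsets, plus functions on C."""

    def __init__(self):
        self.subset_of = {}  # copy -> one of SUBSETS (the partition of C)
        self.bookid = {}     # function Bookid: C -> Title
        self.history = {}    # function History: C -> set of (borrower, date)

    def add(self, copy, title):
        self.subset_of[copy] = "AV"
        self.bookid[copy] = title
        self.history[copy] = set()

    def move(self, copy, subset):
        if subset not in SUBSETS:
            raise ValueError(subset)
        self.subset_of[copy] = subset  # a copy is in exactly one subset

    def members(self, subset):
        return {c for c, s in self.subset_of.items() if s == subset}


copies = Copies()
copies.add("c1", "Icon Handbook")  # hypothetical sample data
copies.add("c2", "Icon Handbook")
copies.move("c2", "CO")
copies.history["c2"].add(("borrower-7", "1994-03-01"))
```

Storing the subset as a single value per copy guarantees that the subsets really do partition C.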

The  specification  of  events  consists  of  precondi¬ 
tions  and  postconditions.  Preconditions  determine 
under  what  circumstances  an  event  may  take  place. 
They serve as a check on the feasibility of input values,
and embody integrity criteria for the information base of
the segment. Postconditions are divided into dataconditions
and sigconditions. Dataconditions indicate the
changes  that  sets  and  maps  undergo  in  consequence  of 
an  event  taking  place.  Sigconditions  send  signals  to 
transactions.  An  event  is  regarded  as  atomic,  i.e.,  the 
objects  with  which  the  event  deals  are  not  accessible  to 
other  events  until  the  event  is  done  with  them.  We  give 
two  events  of  an  elevator  controller  as  examples.  Event 
MoveIdle is initiated by a dispatcher segment. Its purpose
is to move an idle elevator e to a holding floor f
selected by the dispatcher. The elevator has an agenda
of floors to visit. For an idle elevator the agenda has
been empty, and it now becomes the set {f}. The
elevator goes from the idle state into a halt state, and
signal S-SetInMotion will cause it to be set in motion
toward floor f. Note that there are two halt states: from
one  the  next  state  transition  will  be  into  an  upward 
motion  state;  from  the  other  it  will  be  into  a  downward 
motion  state. 

When a set or a function is changed as a result of
an event, the prime symbol denotes the changed object.
Equation S' = S op X tells that S', the set after the event,
is S, the set before the event, modified by X according to
the operation op. With equation fun'(x) = t there can
be two cases. If function fun already contains some pair
<x, s>, then <x, t> replaces this pair; if not, then pair <x,
t> is simply added to the function.
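The two update rules just described can be mimicked in Python, treating a function as a dictionary of pairs. This is a sketch under assumptions: the helper names are hypothetical, and only union and difference are shown as examples of the operation op.

```python
def update_set(s, op, x):
    """Return S' = S op X without mutating S (op: 'union' or 'difference')."""
    if op == "union":
        return s | x
    if op == "difference":
        return s - x
    raise ValueError(op)


def update_function(fun, x, t):
    """Return fun' with fun'(x) = t: replaces <x, s> if present, else adds <x, t>."""
    new_fun = dict(fun)
    new_fun[x] = t
    return new_fun


agenda = {"e1": {3}}
agenda_after = update_function(agenda, "e1", {5})  # replaces the pair for e1
agenda_added = update_function(agenda, "e2", {1})  # adds a new pair for e2
available_after = update_set({"c1", "c2"}, "difference", {"c2"})
```

Returning new objects keeps the unprimed (before) and primed (after) values distinct, as the notation requires.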

EVENT MoveIdle(e: Elevator, f: Floor);
DATACONDITIONS
    Agenda'(e) = {f};
    f > FloorNow(e) → State'(e) = "uphalt";
    f < FloorNow(e) → State'(e) = "dhalt";
SIGCONDITIONS
    (S-SetInMotion(e))ON;
ENDEVENT;

EVENT OpenDoor(e: Elevator);
PRECONDITIONS
    State(e) ∈ {"idle", "uphalt", "dhalt"};
DATACONDITIONS
    Clock'(e) = Time.Now;
SIGCONDITIONS
    (S-ProcessHalt(e))ON;
    (DoorOpen(e))ON;
ENDEVENT;

Event  OpenDoor  is  essential  so  that  people  can 
get out who somehow find themselves in an idle elevator;
raising the flag S-ProcessHalt ensures that the
opened elevator door will ultimately close again. An
event takes place only if all its preconditions are
satisfied.  Here  we  have  just  one  precondition:  the 
elevator  is  to  be  idle  or  in  a  halt  state,  i.e.,  pressing  the 
"Open  Door"  button  has  no  effect  if  the  elevator  is  in 
motion.  The  purpose  of  Clock  is  to  measure  the  time 
for  which  the  elevator  door  is  to  remain  open; 
Time.  Now  is  the  identifier  of  a  function  Now,  which 
belongs  to  type  Time  and  returns  the  current  clock  read¬ 
ing. 
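
The three parts of an event can be mimicked in ordinary code. The following Python sketch of OpenDoor is an assumption-laden illustration, not an SF implementation: the function name open_door, the dictionaries state and clock, the set signals, and the injectable now parameter are all hypothetical.

```python
import time


def open_door(state, clock, signals, e, now=time.monotonic):
    """Hypothetical rendering of event OpenDoor(e).

    Returns True if the event took place, False if its precondition failed.
    """
    # PRECONDITIONS: the elevator must be idle or in a halt state.
    if state[e] not in {"idle", "uphalt", "dhalt"}:
        return False
    # DATACONDITIONS: Clock'(e) = Time.Now
    clock[e] = now()
    # SIGCONDITIONS: both signals are switched on.
    signals.add(("ProcessHalt", e))
    signals.add(("DoorOpen", e))
    return True
```

When the precondition fails, nothing changes: no clock reading is stored and no signal is raised, which matches the remark that the button has no effect on a moving elevator.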

Let us now look at what happens to signals. A
signal is raised by an event, and is then picked up by a
transaction in the same or another segment. A signal
remains alive until it is explicitly turned off. A signal
may be raised in just one segment, which we call its
home segment, but it can be turned off in more than one
segment, by any applicable transaction that picks up the
signal first. The need for signals to have a home
segment was the reason for having the two signals
CatalogCopy and TitleFixed in the library example.
Suppose segment A wants segment B to perform some
action. It makes the request by raising a signal. This
signal is exported by segment A and imported by
segment B, where it triggers a transaction. Our example of
a transaction is triggered by a signal that originates in
the dispatcher segment, and it initiates an event in the
elevator segment.

TRANSACTION;

@( ): ON(S'MoveIdle(e, floor))OFF:

MoveIdle(e, floor);

ENDTRANSACTION;

Consider the components of this transaction. All
transactions have times associated with them. Thus
@(17:15) indicates that every day at 5:15 pm a check is
made whether the transaction so marked is enabled.
The time marker @( ) requires the check to be made
continuously. Transactions are enabled if the signals in
their definitions are on, or if no signal is shown. When
an enabled transaction takes place, the
signals are switched off, and actions are performed. In
our example the action is the automatic initiation of an
event. Actually, @(a) is an abbreviation for @(a, a),
and @(a, b) indicates that the transaction must take
place within the time interval (a, b); @( ) is an
abbreviation for @(Time.Now, Time.Now).
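
A rough Python sketch of this pickup cycle follows. It assumes, purely for illustration, that raised signals are kept in a set of tuples such as ("MoveIdle", e, floor) and that the continuous marker @( ) is modeled by calling the function on every pass of a polling loop; run_transaction and move_idle are hypothetical names.

```python
def run_transaction(signals, move_idle):
    """One pass of the transaction triggered by signal S'MoveIdle.

    signals   -- mutable set of raised signals, e.g. ("MoveIdle", "e1", 3)
    move_idle -- callable standing in for event MoveIdle(e, floor)
    Returns the number of events initiated in this pass.
    """
    fired = 0
    for sig in sorted(s for s in signals if s[0] == "MoveIdle"):
        signals.discard(sig)      # the signal is switched off ...
        _, e, floor = sig
        move_idle(e, floor)       # ... and the action (an event) is performed
        fired += 1
    return fired
```

Signals that the transaction does not name are left in the set for other transactions to pick up.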

Sometimes a situation arises in which a transaction
realizes that an event is to take place, but cannot
initiate  the  event  because  the  necessary  arguments  have 
to  be  supplied  by  a  human  operator.  It  then  prompts  the 
operator  to  initiate  the  event.  A  third  alternative  is  a 
reminder  -  a  transaction  can  remind  a  human  operator 
of  anything  at  all. 


The separation of the action of sending out a
message (i.e., switching on a signal as part of an event)
from the definition of the ultimate effect of this message
(in a transaction, which may reside in a different
segment) has several attractive features. First, events can
be defined independently of transactions, and all
decisions regarding the precise effect of a message can be
postponed. Second, because messages are not
addressed, more than one transaction can pick up a
message, which adds to the flexibility of the entire system.
For example, the availability of a resource can be made
known to several processes that may wish to use this
resource, without naming the processes. Third, all
indications of the time dependence of the effect of a
message belong to the definition of transactions, i.e., the
specification of time-related aspects of the system is
confined to transactions. Fourth, all iterative actions are
confined to transactions, which means that events are
given a particularly simple structure.

Also in the interests of simplicity we have
eliminated a number of features that were to be found in
earlier versions of SF, in particular sensors and
mechanisms. A sensor is a device that supplies information,
e.g., a thermocouple supplies a temperature reading. As
regards an elevator, an indication of the floor at which it
finds itself could be provided by a hardware sensor or
by a software function that is adjusted each time the
elevator passes a floor. Therefore, at a sufficiently high
level of abstraction, there is no essential difference
between sensors and functions. A mechanism is a
device that interfaces with the external world, e.g., causes
an elevator door to be opened. Again, there is no
essential difference between a signal that causes a software
change and a mechanism that causes hardware change.
We therefore eliminated the latter, and DoorOpen
became a signal.

Prompting transactions point out tasks that are
candidates for automation. A prompting transaction
knows that an event is to take place, but cannot supply
all its arguments. If the supplying of the arguments is
taken over by an expert system, the transaction can
initiate the event on its own. If all the activities of an
organization are specified in SF, then prompting
transactions indicate all opportunities for the introduction of
expert systems. In each instance a cost-benefit analysis
can be made, the opportunities ranked according to the
results of the analysis, and expert systems developed
incrementally in order of the ranking. If full automation
is not called for, a decision support system can be made
to help the human decision maker arrive at the
appropriate inputs to the event to be initiated. When
full or partial automation is introduced by means of an
expert system or a decision support system, the new
process will probably be more complex than the
approach it replaces. Reengineering does not always
mean simplification - usually it means increased
effectiveness, but this may require a more complex process
structure.

The only way to remove ambiguity from
specifications is to use a language based in mathematics.
Necessarily, this is also the case with SF, but we have
tried to make the mathematics not too forbidding. Thus,
although all preconditions and dataconditions are
assertions in logic, the notation is not obtrusive. The
assertions use operations on sets, finite functions, and
numerical and boolean data types. All such operations have
well understood mathematical definitions. Signals
relate events and transactions: only an event can switch
on a signal; only a transaction can switch it off. This
follows the Petri net formalism in which places are
linked with transitions, but no two places or two
transitions may be directly linked. The nets are actually
variants of time Petri nets [13, 14] because of the use of
time markers of the form @(a, b). Manipulation of
signals corresponds to the movement of tokens. The
theory of Petri nets thus provides mathematical
foundations for SF processes.

4.  Categories  of  change  and  SF 

In Section 1 we introduced three categories of change:
shift of controller-host boundary, controller
enhancement, and host modification. We shall now relate the
features of SF to these categories. Under the first
category the total system, as seen by an outside
observer, has unchanged functional capabilities, but the
capabilities of its software component have either
increased or decreased.

Let us consider an example. The determination
of the current position of an elevator can be established
by means of a hardware floor indicator or by counting
floors as the elevator moves up and down. The SF
schema contains the function FloorNow, mapping from
elevators to integers. If the values of this function are
supplied by hardware floor indicators (sensors), then the
making of changes to this function is no concern of the
software reengineering team. But suppose now that in
the future the values of FloorNow are to be adjusted by
software. We stipulate that there exist sensors between
floors in every elevator shaft, and that a signal
NextFloorSensor(e) is raised whenever elevator e
passes one of the sensors. In the specification of the
elevator controller this signal initiates the event
PassingSensor. Under the present design the agenda of the
elevator is consulted, and, if the next floor in the
direction of motion of the elevator is on its agenda, the
elevator will be halted at the floor. Under the new design
these actions still need to be performed, but now, in
addition, the value of FloorNow is changed: if elevator
e is moving up, the value of this function for e is
incremented; if it is moving down the value is decremented.
These changes are defined by the assertions

State(e) = "up" →

FloorNow'(e) = FloorNow(e) + 1;

State(e) = "down" →

FloorNow'(e) = FloorNow(e) - 1;
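
For readers who prefer code, the two assertions can be paraphrased in Python as follows. This is only a sketch: update_floor_now is a hypothetical helper, and state and floor_now are assumed to be dictionaries keyed by elevator.

```python
def update_floor_now(state, floor_now, e):
    """Return FloorNow' after elevator e passes a next-floor sensor."""
    updated = dict(floor_now)             # keep the pre-event function intact
    if state[e] == "up":
        updated[e] = floor_now[e] + 1     # moving up: increment
    elif state[e] == "down":
        updated[e] = floor_now[e] - 1     # moving down: decrement
    return updated                        # any other state: unchanged
```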

As  a  design  rule,  we  recommend  that  initially  as 
much  as  possible  of  the  system  be  specified  in  the  form 
of  software  capabilities.  Then  the  event  PassingSensor 
would  contain  the  assertions  that  define  the  changes  in 
value  of  FloorNow  from  the  very  beginning,  and,  in 
implementing  the  design  under  which  the  value  of 
FloorNow  is  supplied  by  a  hardware  indicator,  these 
assertions  would  be  changed  into  comments.  This  has 
two  advantages.  First,  these  assertions-comments 
express the semantics of FloorNow. Second, if a
change from a hardware to a software implementation is
to be made, as we were considering it above, this is
accomplished by changing the comments back into
assertions. It seems that whenever a function can be
implemented by hardware or software, ultimately there
is a need for sensors. Thus, the software implementation
of FloorNow still depends on next-floor-sensors.
Even  if  the  value  of  FloorNow  were  to  be  worked  out 
by  reference  to  the  time  an  elevator  is  in  actual  motion, 
this  would  require  consulting  a  clock,  which  is  again  a 
kind  of  sensor. 

Controller  enhancements  do  not  affect  the 
hardware-software  interface,  i.e.,  every  change  is  either 
a  modification  of  existing  software  or  an  addition  to 
existing  software.  This  means  that  there  are  not  to  be 
any changes to functions that are implemented as
sensors, signals that activate mechanisms, and sensors that
raise  signals.  Let  us  consider  the  system  components 
that  may  change.  The  structure  of  the  information  base 
is  very  likely  to  change  with  the  addition  of  entire  new 
segments,  changes  in  subset  partitions  of  existing  sets, 
or  addition  of  new  functions  in  existing  segments.  The 
most  radical  change  is  the  addition  of  new  segments, 
but,  because  of  communication  by  signals,  this  does  not 
affect  existing  segments  to  any  great  extent.  It  may 
happen,  though,  that  the  new  segment  needs  information 
that has not been gathered in the past, and that the
logical place for holding some of this information is in the
segments already there. Wherever the new functions
may  be  located,  they  are  likely  to  undergo  changes,  and 
the  changes  have  to  be  made  explicit  in  the  definition  of 
events.  Moreover,  new  processes  may  have  to  be 
defined,  which  is  done  by  stringing  together  events. 

Although  usually  changes  are  enhancements, 
there  can  be  exceptions.  Suppose  that  our  library  is  to 
be "downsized", meaning that in the future books will
no longer be lent out. The changes brought about by
this decision are easily implemented: segment
Borrowers disappears; segment Titles is not affected; in
segment Copies signal AdjustCount and function
History are eliminated, as are the events that deal with the
borrowing  and  return  of  copies.  This  list  indicates  also 
the  full  extent  of  the  changes  that  would  have  to  be 
made  in  the  reverse  direction  to  effect  an  enhancement 
that  changes  a  non-lending  library  to  a  lending  library. 

The  most  complicated  changes  take  place  when 
the  host  system  is  modified.  Our  recommendation  is 
that  such  a  modification  be  considered  in  two  parts. 
First, the host change brings about changes in the
interface between host and controller. Second, the interface
changes  necessitate  changes  in  the  software  system. 
The  difficult  part  is  to  deduce  what  changes  of  the 
second  kind  are  to  arise  from  changes  of  the  first  kind. 
Once  this  has  been  determined,  the  modification  of  the 
controller follows the same pattern as controller
enhancement.

5.  Case  study:  an  elevator  controller 

Consider  a  system  that  consists  of  a  bank  of  k  elevators. 
There  are  two  modules,  the  elevator  module  and  a 
dispatcher module. The set of the elevator module
consists of the k elevators. One of the functions is agenda,
which is set-valued, and consists of all the floors the
elevator is to visit. When a person inside the elevator
presses  a  floor  button,  this  floor  is  entered  into  the 
agenda.  When  a  person  arrives  at  the  bank  of  elevators 
and  presses  an  up  or  down  button,  this  is  registered  by 
the  dispatcher.  The  dispatcher  selects  the  elevator  that 
is  to  visit  this  floor,  and  adds  the  floor  to  the  agenda  of 
the  elevator  by  means  of  a  signal.  The  dispatcher  has 
two  further  functions:  when  an  elevator  becomes  idle, 
the  dispatcher  decides  on  a  holding  floor  to  which  this 
elevator is to be moved, and it reactivates an idle
elevator when a need for its services arises. A full SF
specification  of  the  elevator  module  can  be  found  in 
[12]. 

Now a change was proposed. Because there had
been communication breaks between dispatcher and
elevators, people had been waiting for elevators that
never came. The new strategy: when an up or a down
button is pressed, the floor is added directly to the
agenda of every elevator; when an elevator visits this
floor, the floor is taken off the agendas of the other
elevators. Even with a dispatcher-elevator
communication break we have a fail-safe situation - elevators will
stop at some floors unnecessarily, but nobody will be
forgotten.

The  required  changes  to  the  specification  of  the 
elevator  module  were  made  in  less  than  an  hour.  The 
following  changes  were  made: 

•  Two  "pickup"  agendas  for  up  and  down  requests 
were  defined  as  extensions  of  the  existing  agenda. 
Let  us  call  them  up-agenda  and  down-agenda, 
respectively.  Events  AddToUp  and  AddToDown 
were  provided  for  making  additions  to  these 
agendas.  However,  in  contrast  to  the  existing 
agenda,  of  which  there  is  one  for  every  elevator, 
there  is  just  one  up-agenda  and  one  down-agenda. 

•  The  signal  from  the  dispatcher  that  added  a  floor 
to  the  agenda  of  an  elevator  is  now  redundant  - 
hence  the  transaction  that  picked  up  the  signal, 
and  the  event  that  made  the  actual  change  to  the 
agenda  were  deleted. 

•  The pressing of an up (down) button was
interpreted as the activation of event AddToUp
(AddToDown).

•  In event PassingSensor one of the "pickup"
agendas is to be consulted in addition to the existing
agenda.

•  Events  TakeFromUp  and  TakeFromDown  were 
defined.  When  an  elevator  stops  at  floor  k  in 
upward  (downward)  movement,  floor  k  is  taken 
off the up-agenda (down-agenda) by event
TakeFromUp (TakeFromDown).
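
The changes listed above might be pictured in code as follows. This Python sketch is illustrative only and is not the SF specification: the class PickupAgendas and the function should_halt are hypothetical names, and the per-elevator agenda is assumed to be a plain set of floors.

```python
class PickupAgendas:
    """The single up-agenda and down-agenda shared by all elevators."""

    def __init__(self):
        self.up = set()
        self.down = set()

    def add_to_up(self, floor):        # event AddToUp
        self.up.add(floor)

    def add_to_down(self, floor):      # event AddToDown
        self.down.add(floor)

    def take_from_up(self, floor):     # event TakeFromUp
        self.up.discard(floor)

    def take_from_down(self, floor):   # event TakeFromDown
        self.down.discard(floor)


def should_halt(next_floor, direction, own_agenda, pickups):
    """PassingSensor check: consult the elevator's own agenda and the
    pickup agenda that matches its direction of motion."""
    shared = pickups.up if direction == "up" else pickups.down
    return next_floor in own_agenda or next_floor in shared
```

Because the pickup agendas are shared, a floor removed by one elevator is no longer seen by the others, so no message from a dispatcher is needed to clear it.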

Since the up- and down-agendas are common to
all elevators, the dispatcher does not have to be
concerned with taking floors off agendas, so that the change
has simplified the interaction between the dispatcher
and the elevator modules. The simplification can be
taken even further. Instead of an idle elevator being
moved to a holding floor and taken out of the idle state,
an empty elevator just remains at the floor at which it
becomes empty, and any addition to the up- or
down-agenda reactivates the elevator. There is then no need
for the dispatcher module at all. However, the simpler
system introduces one minor inconvenience. With the
elevator selection under the control of the dispatcher,
precisely one elevator would have stopped at a
particular floor to pick up riders. Now more than one elevator
may stop there before this floor is taken off the
appropriate agenda. Further, to minimize the effect of
possible communication breaks, each elevator should
have its own processor, and its own copy of the elevator
module - this raises consistency problems, but the
maintenance of consistency of this distributed system is
not our concern here.

6.  SF  and  other  specification  languages 

Past efforts at the specification of software systems have
rarely managed to combine properties 1 and 2 of the list
of Section 2, and where they have done so, their
domains of applicability have been rather narrow. They
certainly have not satisfied all ten requirements. In fact,
there  seems  to  be  a  polarization,  with  a  multitude  of 
diagramming  conventions  at  one  pole,  and  sophisticated 
mathematical  notations  at  the  other.  Let  us  look  now  at 
a  few  specification  methodologies  that  have  found  some 
measure  of  acceptance  in  industry,  particularly  in  Great 
Britain,  and  that  go  beyond  inadequately  interpreted 
diagramming  techniques. 

Jackson System Development [15]. A JSD
specification makes use of diagrams, but the diagrams
are not intended to do much more than help arrive at an
intuitive understanding of a system. The actual
specification resembles a program. The specification
defines a distributed network of processes that
communicate by message passing. However, the interpretation
of the meaning of JSD constructs is left largely to the
user's intuition, which makes it difficult to tell the
precise nature of the message-passing mechanism. A
serious flaw of JSD is its extreme process orientation,
which leaves the data aspect underemphasized. First,
this hinders modularization. Second, when a JSD
system refers to a data base, the data base is regarded as
being outside the system. Data base constraints and
operations are therefore the concern of a data base
administrator rather than the developer of a controller.
In the 1970s this was considered an advantage, but
today we realize that data base constraints are rarely
absolutes, i.e., that they are for the most part determined
by the processes that make use of data.

The Statechart visual formalism [16, 17].
Statecharts are process diagrams with very elaborate
diagramming conventions that allow a controller and its
interaction with its host to be described in rigorous
terms. The emphasis is on states, and on transitions
between states. Transitions carry labels. On the
diagram the labels are represented in a shorthand
notation, but a dictionary contains them in a fully expanded
form, and the expansions are written in a formal
language. Some process designers may feel happy
working with diagrams supported by text, and for them
Statecharts are probably the most appropriate notation.
Others may feel happier working with text supported by
diagrams. They may find the SF formalism more to
their liking.

The specification language Z [18]. A Z
(pronounced "zed") specification consists of a declaration of
the components of a data base, an expression in logic
that defines valid states of the data base, and definitions
of operations. Operations change data base states.
Some background in discrete mathematics allows a
reading knowledge of Z to be acquired fairly easily, but
this does not mean that Z specifications are always easy
to understand. The problem is that the state into which
an operation brings the data base need not be a valid
state, so that additional implicit operations may be
needed to restore validity. Consequently Z
specifications are mathematically elegant, but are not
easy to validate by potential system users who often
lack the required mathematical sophistication.
Moreover, Z lacks facilities for dealing with real-time
embedded systems. The emphasis that Z puts on states
rather than types implies that there is no natural base for
the modularization of large Z specifications.

Vienna Development Method (VDM) [19, 20].
Originally developed for precise definition of
programming language constructs, VDM has become a widely
used specification language. In VDM data are
structured into records, and these records can serve as a base
for modularization. However, just as with Z, there are
no real-time facilities. We regard SF as a
modernization and extension of VDM.

7.  An  agenda  for  the  future 

We have accumulated considerable experience
regarding the effect of major design changes on SF
specifications. The modular structure of an SF
specification, and communication by message passing
have ensured that in all cases the changes could be
carried out cleanly and rapidly. There are two major tasks
to be undertaken. First, a methodology has to be
developed for transforming an existing software system
into an SF specification. This is a very difficult task,
and the transformations will require investment of
significant amounts of time by system personnel, but the
investment will pay off in the long run. As we noted in
Section 2, it is to be regarded as preventive
maintenance.

There are also reliability considerations. The
reliability of a software system is the probability that it will
not fail during a given time interval of execution, where
by failure is meant a deviation from requirements.
Unfortunately this definition presents difficulties. First,
failure of a controller need not be due to deviation from
requirements - the situation that leads to the failure may
not have been anticipated. Second, a controller is often
a reactive system, and execution time is then not the
appropriate metric for the reliability computation. We
have begun to address these difficulties.

However, the main problem is how to preserve
reliability when controllers of very high reliability are
being adapted to a changing host and changing
operating conditions. It has been demonstrated that the
reliability of software of ultra-high reliability cannot be
determined by testing [21]. However, for software that
has been in the field at multiple sites over a period of
several years, the reliability can be determined - for an
operating system that supports a monitoring system
coupled to a nuclear reactor, a history of failures gathered
at 5000 sites allows prediction of a future reliability of
0.9 per 1000 years of operation under two different
reliability models [22, p. 205]. Such software should be
adapted in a way that preserves its known reliability.

The second major task is translation of SF
specifications into programming languages, e.g., Ada or
C. Although a pilot project has shown that SF can be
translated into C [23], the translation is intended for
prototyping alone. Extensive real-time features allow easy
expression of performance requirements in SF, but an SF
implementation is unlikely to be sufficiently efficient to
allow these requirements to be satisfied. We need,
therefore, to develop a methodology for making an
initial implementation (in Ada, say) more efficient by
means of source-to-source transformations.

Abstract

This paper discusses the difficulties in validating
timing constraints of dynamic multiprocessor and
distributed systems. Some worst-case bounds and
efficient algorithms now exist for the special case where
jobs are independent. These results are summarized.

1  Introduction 

In a real-time system, many jobs are time-critical;
their execution must meet certain timing constraints.
The term job refers to a unit of work to be scheduled
and executed. A job may be the computation of a
control law, the transmission of an operator command,
the retrieval of a file, etc. To execute, it requires a
computer, a data link, or a disk, respectively; we refer to
them all as processors. The length of time a job
requires to complete if it were to execute alone is called
its execution time. In the simplest form, the timing
constraints of a job are specified in terms of its release
time and deadline; the job cannot begin to execute
until its release time and must complete its execution
by its deadline. The failure of a job to complete by
its deadline is considered to be a time fault, and a
real-time system functions correctly only in the
absence of time faults. To validate a real-time system,
its builder must be able to demonstrate convincingly
not only that the system meets all of its functional
requirements but also that every time-critical job in it
always completes by the job's deadline.

"Non-functional, or quality, aspects of large
systems are often treated in an ad hoc manner, even when
they are critical to the system's ultimate success."
This statement by Salasin and Waugh [1] is especially
true for real-time systems. Traditionally, when
building or reengineering a real-time system, one first
focuses on its functional requirements. Whether the
system can meet its real-time requirements is checked
only after most of the design decisions have been made
and, often, after parts of the system have been
implemented. Timing constraints are validated by
exhaustive simulation or testing. This approach is time
consuming and costly. To ensure that the system can be
reliably tested, one is forced to restrict the choices of
scheduling strategy, operating system, and underlying
system architecture. For this reason, modern
scheduling paradigms that lead to easy-to-modify/maintain
systems are not used. Almost all real-time systems
that support critical applications use clock-driven or
cyclic scheduling strategies. Such a system is
brittle and difficult to maintain and extend. Because a small
change in the application software or the
underlying hardware and system software can produce
unpredictable timing effects, the system must be tuned
and tested exhaustively after every change.

*This work has been partially supported by ONR Contract
Nos. N00014-89-J-1181 and NOOO-92-J-1815 and NASA Grant
No. NAG 1-1613.

This situation has improved in recent years. There
are now reliable and tractable validation methods for
static multiprocessor and distributed systems [2-7].
By static system, we mean a system in which jobs
are statically assigned and bound to processors and
are migrated among processors on a relatively
infrequent basis. Jobs on each processor are scheduled
according to a uniprocessor scheduling algorithm. A
new generation of analysis and validation tools built
on these recent theoretical advances is now beginning
to emerge. (An example is PERTS [8].)

In contrast, efficient methods for validating
dynamic multiprocessor and distributed systems are not
yet available. In a dynamic system, jobs ready for
execution are placed in a common queue and are
dispatched and scheduled on available processors in an
event-driven manner. Although numerous dynamic
scheduling algorithms are available, the lack of
efficient, reliable and provably correct ways to validate
that all deadlines are met in dynamic systems
prevents the practical adoption of these algorithms.

This paper first gives an overview of existing
analytical and efficient methods for validating static
systems built on well-known scheduling algorithms. It
then describes several new worst-case bounds and
efficient algorithms for validating dynamic systems that
cannot be validated using existing methods. The
special cases of the validation problem considered here
are concerned with independent jobs that have
arbitrary release times, arbitrary deadlines, and
variable execution times. Jobs are scheduled according
to a priority-driven algorithm. A scheduling
algorithm is priority-driven if it does not leave any
resource idle intentionally. Such an algorithm can be
implemented by assigning priorities to jobs and
placing all jobs ready for execution in a queue ordered
by their priorities. The available processor(s) is (are)
allocated to the job(s) at the head of this queue.
Priority-driven algorithms differ from each other in
the rules they use to assign priorities to jobs.
Almost all commonly used event-driven scheduling
algorithms, such as FIFO, LIFO,
shortest-processing-time-first, earliest-deadline-first, rate-monotonic, and
deadline-monotonic algorithms are priority-driven.
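
The queue-based implementation just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name dispatch, the representation of jobs as dictionaries, and the convention that a smaller key means a higher priority are all choices made here for the example.

```python
import heapq


def dispatch(ready_jobs, idle_processors, priority):
    """Allocate idle processors to the jobs at the head of the priority queue.

    priority(job) returns a sortable key; a smaller key is more urgent.
    Returns a dict mapping processor -> job.
    """
    # The index i breaks ties, so job objects themselves are never compared.
    heap = [(priority(job), i, job) for i, job in enumerate(ready_jobs)]
    heapq.heapify(heap)
    assignment = {}
    for proc in idle_processors:
        if not heap:              # no ready job left: the processor stays idle
            break
        _, _, job = heapq.heappop(heap)
        assignment[proc] = job
    return assignment
```

Only the priority rule distinguishes one algorithm from another: passing `lambda j: j["deadline"]` gives earliest-deadline-first, while `lambda j: j["release"]` gives FIFO.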

Following this introduction, Section 2 discusses the
difficulties in validating timing constraints in dynamic
systems and gives a formal definition of the validation
problem considered here. Section 3 gives an overview
of methods for validating static systems. Section 4
summarizes the worst-case bounds and efficient
algorithms for validating timing constraints of
independent jobs in dynamic systems. Section 5 discusses the
work that remains to be done in order to build a
comprehensive strategy for validating dynamic systems.

2  Validation  Problem 

It  is  well-known  that  a  system  in  which  jobs  are 
scheduled  in  a  priority-driven  manner  may  exhibit 
scheduling anomalies. Graham [9] has shown that the
completion time of a set of jobs can be later when
more  processors  are  used  to  execute  them  and  when 
jobs  have  shorter  execution  times  and  fewer  dependen¬ 
cies.  When  jobs  have  arbitrary  release  times  and  share 
nonpreemptable  resources,  scheduling  anomalies  can 
occur  even  when  there  is  only  one  processor  and  the 
jobs  are  preemptable.  These  anomalies  make  ensuring 
full  coverage  in  simulation  and  testing  difficult  when¬ 
ever  there  are  variations  in  job  execution  time  and 
resource  requirements  and  jitters  in  job  release  times. 
Unfortunately,  these  variations  are  often  unavoidable. 
Given  an  arbitrary  scheduling  algorithm,  there  is  no 
efficient  way  to  find  the  worst-case  completion  time 
of  each  job.  This  is  why  exhaustive  simulation  and 
testing  are  impractical  and  unreliable  when  used  to 
determine  whether  all  jobs  always  complete  in  time  in 
large  and  dynamic  systems. 

Figure 1 shows an illustrative example. The simple
system in this figure contains 4 independent jobs
and 2 identical processors. The release time, deadline
and execution time of job $J_i$ are denoted by $r_i$, $d_i$ and
$e_i$, respectively. These job parameters are listed in
the table. In this example, the execution times of all
the jobs are known except for $J_2$. Its execution time
can be any value in the range [2,6]. The scheduling
algorithm in this example is preemptive and priority-driven;
the priority order is $J_1$, $J_2$, $J_3$ and $J_4$, with $J_1$
having the highest priority. A constraint is that jobs
are not migratable. In other words, once a job begins
execution on a processor, it is constrained to execute
on that processor until completion. We want to validate
that all deadlines can be met, assuming that the
scheduler works correctly, that is, it never schedules
any job before its release time. A naive way is to
simulate the system twice: once when the execution time of $J_2$
has the maximum value 6 and once when it has the
minimum value 2. The results are the schedules shown
in parts (a) and (b) of Figure 1. By examining these
schedules, we would conclude that all jobs can
complete by their deadlines. This conclusion is incorrect
because the simulation test does not give us full coverage.
This fact is illustrated by the schedules in parts
(c) and (d). The worst-case schedule is shown in (c);
the completion time of $J_4$ is 21 when the execution
time of $J_2$ is 3. The best-case schedule is shown in
(d); $J_4$ completes at time 15 when the execution time
of $J_2$ is 5. To find the schedules in (c) and (d) by
simulating the system, we need to exhaustively try all
possible execution times of $J_2$.

Figure 1: An example illustrating scheduling anomalies
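The anomaly can be reproduced with a small discrete-time simulation. The sketch below is only illustrative: the table of Figure 1 is not reproduced in this text, so the job parameters used here are assumptions chosen to agree with the completion times quoted above (16 and 20 at the two extremes, 21 and 15 in between). The function name `simulate_nonmigratable`, the unit time step, and the tie-breaking rule (an unstarted job prefers a processor with no unfinished job bound to it) are likewise assumptions, not part of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    release: int
    exec_time: int


def simulate_nonmigratable(jobs, m):
    """Preemptive, priority-driven, non-migratable scheduling on m processors.

    `jobs` is listed in decreasing priority order. Time advances in unit
    steps. Returns a dict mapping job name to completion time.
    """
    remaining = {j.name: j.exec_time for j in jobs}
    bound = {}    # job name -> processor it started on (it must stay there)
    finish = {}   # job name -> completion time
    t = 0
    while len(finish) < len(jobs):
        taken = set()  # processors already claimed in this time step
        for j in jobs:  # highest priority first
            if j.release > t or j.name in finish:
                continue
            if j.name in bound:
                p = bound[j.name]
                if p in taken:
                    continue  # its processor is used by a higher-priority job
            else:
                free = [q for q in range(m) if q not in taken]
                if not free:
                    continue
                # Assumed tie-break: avoid preempting when an idle processor exists.
                busy = {bound[n] for n in bound if n not in finish}
                idle = [q for q in free if q not in busy]
                p = (idle or free)[0]
                bound[j.name] = p
            taken.add(p)
            remaining[j.name] -= 1
            if remaining[j.name] == 0:
                finish[j.name] = t + 1
        t += 1
    return finish


def figure1_jobs(e2):
    # Assumed parameters (release, execution time); J2's execution time varies.
    return [Job("J1", 0, 5), Job("J2", 0, e2), Job("J3", 4, 8), Job("J4", 0, 10)]


# Completion time of J4 for every integer execution time of J2 in [2, 6].
j4_completion = {e2: simulate_nonmigratable(figure1_jobs(e2), 2)["J4"]
                 for e2 in range(2, 7)}
print(j4_completion)
```

Under these assumed parameters, testing only the two extreme execution times of J2 misses the case in which J4 completes latest, which is the point of the example.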

Our  objective  is  to  find  analytical  expressions  and 
efficient  algorithms  with  which  we  can  determine  re¬ 
liably  whether  every  job  can  meet  its  deadline.  In 
other  words,  given  a  set  of  jobs,  the  set  of  resources 
available  to  the  jobs,  and  the  scheduling  (and  resource 
access-control)  algorithm  to  allocate  processors  and 
resources  to  jobs,  we  want  to  find  in  polynomial  time 
an  upper  bound  of  the  completion  time  of  every  job. 
Because  jobs  have  different  properties  and  there  are 
different  rules  governing  resource  usage,  this  problem 
has  many  variants.  The  periodic-task  schedulability 
analysis  problem  whose  solutions  are  summarized  in 
Section  3  is  a  variant.  This  section  describes  other 
variants  of  this  problem  that  remain  to  be  solved,  as 
well  as  the  variant  solved  by  the  results  presented  in 
Section 4. It also introduces the notation that will be
used later.


Variants  of  the  Validation  Problem 

We characterize the workload to be scheduled and,
hence, analyzed as a set $\mathbf{J} = \{J_1, J_2, \ldots, J_n\}$ of jobs.
Each job $J_i$ is defined by its release time $r_i$, deadline
$d_i$ and execution time $e_i$. When there is jitter in its
release time, $r_i$ can have any value in the range $[r_i^-, r_i^+]$,
where $r_i^-$ and $r_i^+$ are the earliest release time and the
latest release time of $J_i$, respectively. Without loss of
generality, we assume that $r_i^- \ge 0$ for all $i$, that is, no
job is released before $t = 0$. We say that the jobs have
fixed release times, or there are no jitters, when
$r_i^- = r_i = r_i^+$, and that they have identical, or zero, release
times when $r_i^- = r_i^+ = 0$ for all $i$. The actual execution
time $e_i$ is in the range $[e_i^-, e_i^+]$ and therefore can
be as small as its minimum execution time $e_i^-$ and as
large as its maximum execution time $e_i^+$. $[r_i^-, r_i^+]$ and
$[e_i^-, e_i^+]$ are given parameters of $J_i$. $J_i$'s actual execution
time $e_i$ may depend on its input data, as well as
the underlying hardware configuration and run-time
environment, and may be unknown until the job's
execution completes. Similarly, the actual release time
$r_i$ of $J_i$ becomes known when $J_i$ is released.

The jobs in $\mathbf{J}$ may be dependent; data and control
dependencies between them impose precedence constraints
on the order of their execution. A job $J_i$ is
a predecessor of another job $J_j$ (and $J_j$ is a successor
of $J_i$) if $J_j$ cannot begin execution until the execution
of $J_i$ completes. Two jobs $J_i$ and $J_j$ are independent
if they can be executed in any order.

We confine our attention here to off-line scheduling.
In other words, the scheduler knows the parameters
$[r_i^-, r_i^+]$, $[e_i^-, e_i^+]$ and $d_i$ of every job $J_i$ before
any job begins execution. The scheduling algorithm
is priority-driven. It assigns fixed priorities to jobs.
It may assign priorities to jobs based on the known
job parameters. Some algorithms, such as the FIFO
algorithm, assign priorities to jobs according to their
actual release times. However, none of the algorithms
considered here assign priorities to jobs based on their
actual execution times.

Therefore, the given scheduling algorithm is completely
defined by the list of priorities it assigns to
the jobs. Without loss of generality, we assume that
the priorities of jobs are distinct. We will use the
list $(J_1, J_2, \ldots, J_n)$ in decreasing priority order except
where it is stated to be otherwise. In other words, we
always index the jobs so that $J_i$ has a higher priority
than $J_j$ if $i < j$. $\mathbf{J}_i = \{J_1, J_2, \ldots, J_i\}$ denotes the
subset of jobs with priorities equal to or higher than
the priority of $J_i$.

In a dynamic system containing $m$ identical processors,
the scheduler maintains a common priority queue
and places all jobs ready for execution in the queue.
There are the following three cases:

(1)  preemptable  and  migratable:  In  this  case,  a  job 
can  be  scheduled  on  any  processor.  It  may  be  pre¬ 
empted  when  a  higher  priority  job  becomes  ready. 
Its  execution  may  resume  on  any  processor. 

(2) preemptable and nonmigratable: As in case (1),
each  job  can  begin  its  execution  on  any  processor 
and  is  preemptable.  However,  it  is  constrained 
to  execute  to  completion  on  the  same  processor. 
Figure  1  gives  an  example  of  this  case. 

(3)  nonpreemptable:  Each  job  can  be  scheduled  on 
any  processor.  Some  or  all  of  the  jobs  are  non¬ 
preemptable. 

In  addition  to  processors,  the  system  may  also  have 
a  set  of  serially  reusable  resources.  A  job  may  require 
some  of  these  resources,  as  well  as  a  processor,  in  order 
to  execute.  When  some  of  the  resources  required  by 
two  or  more  jobs  are  the  same,  the  jobs  are  said  to  be 
in  resource  conflict.  We  assume  that  a  resource  access- 
control  protocol  is  used  to  resolve  resource  conflicts 
among jobs, and this protocol controls priority inversion
and prevents deadlock. Therefore, the length of
time any job $J_i$ may be blocked from execution due to
resource conflict is bounded from above. This bound is
called the worst-case blocking time of $J_i$ and is denoted
by $b_i$. For a given resource access-control protocol, $b_i$
is given for every job $J_i$. We need not be concerned
with the specific details about the protocol.

The  validation  problem  has  four  dimensions.  Two 
dimensions  are  whether  jobs  are  dependent  and 
whether  they  share  any  resource.  Section  4  is  con¬ 
cerned  only  with  independent  jobs  that  do  not  share 
any  resources.  The  other  two  dimensions  of  the  prob¬ 
lem  are  the  release-time  characteristics  of  jobs  and 
rules  in  scheduling.  Each  special  case  based  on  these 
two dimensions is referred to by three capital letters
separated by “/”. The first letter denotes preemptability.
It can be either “P”, for preemptable, or “N”,
for nonpreemptable. The second letter defines the
migratability of jobs. It can be either “M”, for migratable,
or “N”, for nonmigratable. The third letter
describes the release-time characteristics. It can be
either “Z”, for zero release times, or “F”, for fixed
arbitrary release times, or “J”, for jittered release times.
For  example,  by  jobs  being  P/M/F  (or  P/M/F  jobs), 
we  mean  jobs  that  are  preemptable  and  migratable 
and  have  fixed  arbitrary  release  times.  N/N/Z  jobs  are 
nonpreemptable  (and  therefore  not  migratable)  and 
have  zero,  or  identical,  release  times. 

Definitions and Notations

Let $\mathbf{J}_i^+$ denote the set $\{J_1^+, J_2^+, \ldots, J_i^+\}$ in
which every job has its maximum execution time. Similarly,
$\mathbf{J}_i^-$ denotes the set $\{J_1^-, J_2^-, \ldots, J_i^-\}$ in which
every job has its minimum execution time. We refer
to the schedule of $\mathbf{J}_i$ produced by the given algorithm
as the actual schedule $A_i$ and the schedule of $\mathbf{J}_i^+$ (or
$\mathbf{J}_i^-$) produced by the same algorithm as the maximal
(or the minimal) schedule $A_i^+$ (or $A_i^-$) of $\mathbf{J}_i$.

Let $S(J_i)$ be the instant of time at which the
execution of $J_i$ begins according to the actual schedule
$A_n$. $S(J_i)$ is the (actual) start time of $J_i$. Let $S^+(J_i)$
and $S^-(J_i)$ be the observable start times of $J_i$ in the
schedules $A_n^+$ and $A_n^-$, respectively. Clearly, for jobs
with fixed release times, $S^+(J_i)$ and $S^-(J_i)$ can easily
be found by constructing the maximal and minimal
schedules and observing when $J_i$ starts according to
these schedules. In contrast, $S(J_i)$ is unknown until
$J_i$ actually starts. Moreover, because of varied execution
times, $J_i$ may start at different times when we
repeatedly simulate the system. We say that the start
time of $J_i$ is predictable if $S^+(J_i) \ge S(J_i) \ge S^-(J_i)$.

Similarly, let $F(J_i)$ be the instant at which $J_i$ completes
execution according to the actual schedule $A_n$.
$F(J_i)$ is the completion time of $J_i$. The response
time of a job is the length of time between its release
time and its completion time. Let $F^+(J_i)$ and
$F^-(J_i)$ be the observable completion times of $J_i$
according to the schedules $A_n^+$ and $A_n^-$, respectively.
The completion time of $J_i$ is said to be predictable if
$F^+(J_i) \ge F(J_i) \ge F^-(J_i)$.

We say that the execution of $J_i$ is predictable if both
its start time and completion time are predictable.
When the execution of $J_i$ is predictable, the completion
time $F^+(J_i)$ in the schedule $A_n^+$ minus the
minimum release time $r_i^-$ of $J_i$ gives $J_i$'s worst-case
response time. $J_i$ meets its deadline if $F^+(J_i) \le d_i$.

Let $w_i(t, t')$, for time instants $t < t'$, denote the
sum of execution times of all the jobs in the set $\mathbf{J}_i$
whose release times are in the interval $[t, t']$. The
job $J_i$ and all jobs with higher priorities than it require
at most $w_i(t, t')$ additional units of processor
time in the interval $[t, t']$. We call $w_i(t, t')$ the incremental
(processor) time demand of $J_i$ in the interval
$[t, t']$. $w_i(t) = w_i(0, t)$ is, therefore, the total (processor)
time demand of $J_i$ before $t$. It is equal to the
amount of processor time required by all jobs that are
in $\mathbf{J}_i$ and have release times at or earlier than $t$. Similarly,
let $w_i^+(t, t')$ (or $w_i^-(t, t')$) be the sum of the
maximum (or minimum) execution times of all jobs
that are in $\mathbf{J}_i$ and have release times in $[t, t']$. Let
$w_i^+(t) = w_i^+(0, t)$ (or $w_i^-(t) = w_i^-(0, t)$). $w_i^+(t)$ (or
$w_i^-(t)$) is the maximum (or minimum) time demand
of $J_i$ before $t$. Clearly, $w_i^-(t) \le w_i(t) \le w_i^+(t)$ for all
$i$ and $t$.
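As a small illustration of these definitions, the following sketch computes the maximum and minimum time demands for jobs with fixed release times. The tuple layout, the function names, and the sample job set are assumptions made for the example; they do not come from the paper.

```python
# Each job is (release, e_min, e_max); the list is in decreasing priority
# order, so jobs[:i] is the set of J_i and all higher-priority jobs.

def incremental_demand(jobs, i, t0, t1, pick):
    """Sum pick(job) over the i highest-priority jobs released in [t0, t1]."""
    return sum(pick(job) for job in jobs[:i] if t0 <= job[0] <= t1)


def w_max(jobs, i, t):
    """Maximum time demand of J_i before t: uses maximum execution times."""
    return incremental_demand(jobs, i, 0, t, lambda job: job[2])


def w_min(jobs, i, t):
    """Minimum time demand of J_i before t: uses minimum execution times."""
    return incremental_demand(jobs, i, 0, t, lambda job: job[1])


# Hypothetical job set: only the second job has a variable execution time.
jobs = [(0, 5, 5), (0, 2, 6), (4, 8, 8), (0, 10, 10)]

print(w_min(jobs, 4, 3), w_max(jobs, 4, 3))  # third job not yet released
print(w_min(jobs, 4, 4), w_max(jobs, 4, 4))  # third job released at t = 4
```

Any actual demand lies between the two printed values, since each actual execution time lies between its minimum and maximum.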

3 Methods for Validating Static Systems

Again, almost all existing analytical and efficient
methods for bounding the worst-case completion times
are for static systems. In this case, a general strategy
is to first determine how long the start and completion
of each job $J_i$ can be delayed by jobs that are
assigned to the same processor and, then, take
into account additional delays due to synchronization
with jobs on all processors.

The best known and the most comprehensive set
of bounds and algorithms are those based on the
periodic-task model [2-7]. In this model, the set of
jobs assigned and executed on each processor is partitioned
into $n$ subsets, each called a task. Some tasks
are periodic; each periodic task $T_i$ is a sequence of jobs
whose release times are spaced nominally at regular
intervals. There may be jitters, but the lengths of these
intervals are never less than $p_i$, called the period of
the task. The release time $f_i$ of the first job in a task
$T_i$ is called its phase. The length of time $\delta_i$ between
the release time of every job in $T_i$ and its deadline is
called the relative deadline of $T_i$. $\delta_i$ is usually equal
to or less than $p_i$. With a slight abuse of the notation,
we use $e_i^+$ to denote the maximum execution time of
each job in $T_i$. $u_i = e_i^+ / p_i$ is the maximum fraction of
time the jobs in $T_i$ use the processor and is called the
utilization of the task $T_i$.

Some tasks are (periodic) servers [4]. A periodic
server is created to handle the execution of a stream of
jobs whose release times and execution times are random
variables. Jobs handled by each server are placed
in a priority queue. Whenever the server is scheduled
and allocated the processor, the job at the head of this
queue executes. Each server $T_i$ is characterized by its
period $p_i$, execution time (budget) $e_i^+$, and relative
deadline $\delta_i$. The scheduler treats each server as a
periodic task with these parameters. Therefore, when we
try  to  bound  the  completion  times  of  jobs  in  periodic 
tasks,  there  is  no  need  to  treat  the  servers  differently. 

The  precedence  constraints  between  jobs  in  the 
same  task,  if  any,  are  naturally  taken  care  of  by  mak¬ 
ing  the  release  time  of  every  predecessor  job  equal  to 
or  earlier  than  its  successor  jobs  and  by  executing  jobs 
in  the  task  in  the  FIFO  order.  Similarly,  data  and  con¬ 
trol  dependencies  between  jobs  in  different  tasks  can 
be  taken  care  of  by  adjusting  the  phases  of  the  tasks 
so  that  the  deadline  of  every  predecessor  job  is  earlier 
than  the  release  times  of  its  successor  jobs.  In  this 
way,  we  can  ignore  precedence  constraints  and  treat 
all  jobs  as  if  they  are  independent. 

In our notation, most of the jobs are P/N/J jobs;
they are scheduled preemptively (and are not migrated).
Their resource accesses are controlled by a
protocol (such as the ones in [6, 7]) that ensures the
blocking time of every job in $T_i$ due to resource conflicts
with all jobs in the system is never more than
$b_i$. For such systems, there are several sufficient conditions,
which, when satisfied by a task, allow us to
conclude that all jobs in it always complete by their
deadlines. An example is the inequality

$$\sum_{k=1}^{i} u_k + \frac{b_i}{p_i} \le i\,(2^{1/i} - 1) \qquad (1)$$

assuming that we index the tasks so that $p_1 \le p_2 \le
\cdots \le p_n$. When there is only one processor and tasks
are scheduled on the rate-monotonic basis (that is, the
shorter the period, the higher the priority) and synchronized
according to the priority-ceiling protocol [6],
all jobs in $T_i$ with $\delta_i = p_i$ always complete by their
deadlines as long as (1) is satisfied. Similar conditions
are known for many other fixed-priority algorithms
(which assign the same priority to all jobs in each task
and schedule the jobs in the same task in the FIFO
order) and for arbitrary values of $\delta_i$ less than or equal
to $p_i$. Bounds also exist for the earliest-deadline-first
algorithm, when there is only one processor and a protocol
such as the stack-based protocol [7] is used. It is
straightforward to generalize the conditions to account
for the effects of nonpreemption if some jobs are not
preemptable. It is also straightforward to use these
conditions to bound the worst-case completion times
of jobs in periodic job-shops and flow-shops where each
job consists of subjobs which execute in turn on two
or more processors and have end-to-end deadlines [10].
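A utilization test of this kind is easy to mechanize. The sketch below checks, for the i-th task in rate-monotonic order, whether the total utilization of the i highest-priority tasks plus the blocking term b_i/p_i stays within i(2^(1/i) - 1), which is how condition (1) is read here. The function name, the tuple layout (period, maximum execution time, blocking time), and the sample task sets are assumptions for illustration.

```python
def satisfies_utilization_bound(tasks, i):
    """Sufficient (not necessary) schedulability test for task i (1-based).

    `tasks` is a list of (period, max_exec_time, blocking_time) tuples,
    sorted by non-decreasing period (rate-monotonic priority order).
    """
    period_i, _, blocking_i = tasks[i - 1]
    utilization = sum(e / p for p, e, _ in tasks[:i])
    lhs = utilization + blocking_i / period_i
    rhs = i * (2 ** (1.0 / i) - 1)
    return lhs <= rhs


# Hypothetical task sets.
light = [(4, 1, 0), (5, 1, 1), (20, 2, 0)]
heavy = [(4, 1, 0), (5, 2, 0), (20, 5, 0)]

print([satisfies_utilization_bound(light, i) for i in (1, 2, 3)])
print([satisfies_utilization_bound(heavy, i) for i in (1, 2, 3)])
```

A `False` result does not mean that the task misses deadlines; it only means that this particular sufficient condition cannot be used to conclude that it never does.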

The known sufficient conditions, such as (1), are
particularly robust. Specifically, the values of the periods
and worst-case execution times of jobs in tasks
$T_1, T_2, \ldots, T_i$ do not appear in the left-hand side
of (1), only their utilizations. Moreover, (1) assumes
that  the  job  being  analyzed  is  released  at  an  instant 
between  which  and  the  deadline  of  the  job  the  total 
processor  time  demand  of  all  higher-priority  jobs  is  the 
largest.  Therefore  if  the  job  were  actually  released  at 
this  instant,  it  would  have  the  largest  response  time. 
Because  of  this  assumption,  the  conclusion  that  a  job 
can  complete  before  its  deadline  based  on  such  a  suffi¬ 
cient  condition  remains  true  no  matter  what  its  actual 
release  time  is. 

However, a test based on a sufficient condition like
(1) is sometimes pessimistic. If (1) fails to hold for $T_i$,
for example, its jobs may nevertheless always complete
in time. An algorithm that makes use of the known
parameters $p_i$ and $e_i^+$ can give a more accurate prediction
of the worst-case response times. Such algorithms are
used in PERTS [8]. They are based on a more exact
characterization [5] of the rate-monotonically scheduled
periodic tasks. To determine whether any job in
a task $T_i$ can meet its deadline, the algorithm takes
the release time of the job as the time origin 0 and
computes the maximum time demand $w_i^+(t)$ between
0 and its deadline at $\delta_i$. To compute $w_i^+(t)$, the
algorithm uses as phases of the other tasks the values that
maximize $w_i^+(t)$ for all $t$. (Usually, the choices are
$f_k = 0$ for all $k$.) The job completes for sure at the
earliest time instant $t$ when $w_i^+(t) + b_i \le t$. In other
words, the worst-case completion time of the job is
$w_i^+(t) + b_i$ after its release time.
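The search for the earliest instant at which the supply of time meets the demand can be sketched as a fixed-point iteration. The code below is a minimal illustration, not the PERTS implementation: it assumes integer parameters, zero phases, relative deadlines equal to periods, and the commonly used ceiling form of the demand function (which counts releases of higher-priority tasks in [0, t)). The function name and the sample task sets are hypothetical.

```python
def worst_case_response_time(tasks, i, blocking=0):
    """Return the worst-case response time of task i (1-based), or None.

    `tasks` is a list of (period, max_exec_time) in decreasing priority
    order. None means no instant t <= period satisfies demand(t) + b <= t.
    """
    period_i, exec_i = tasks[i - 1]
    higher = tasks[:i - 1]
    # Start from the smallest possible demand: one job of every task.
    t = exec_i + blocking + sum(e for _, e in higher)
    while t <= period_i:
        # -(-t // p) is integer ceil(t / p): releases of each higher task in [0, t).
        demand = exec_i + sum(-(-t // p) * e for p, e in higher)
        if demand + blocking <= t:
            return t
        t = demand + blocking  # jump to the next candidate instant
    return None


# Hypothetical task sets in rate-monotonic order.
feasible = [(4, 1), (5, 2), (20, 5)]
overloaded = [(4, 2), (5, 2), (20, 5)]

print([worst_case_response_time(feasible, i) for i in (1, 2, 3)])
print(worst_case_response_time(overloaded, 3))
```

Note that the first task set has a total utilization of 0.9, so a utilization-bound test would be inconclusive for its third task, while the iteration shows that it completes within its period.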

We note that this approach of checking when the
supply of time meets the demand for time is general
enough to be applied to validate most static systems.
Many systems do not fit the
periodic-task model; jobs have arbitrary release times
and precedence constraints. Because jobs on each processor
are scheduled by themselves, a way to ensure
that the scheduler can always enforce the given precedence
constraints is to work with the effective deadlines
and release times of jobs rather than their given
deadlines and release times. A job with successors
must be completed before them. Hence, the effective
deadline of the job is the earliest deadline among its
deadline and the deadlines of its successors. Similarly,
its effective release time is the latest time among its
release time and the release times of its predecessors.
Working with effective release times and deadlines allows
the scheduler to temporarily ignore the precedence
constraints between jobs and make scheduling
decisions as if the jobs are independent. To validate
whether every job completes before its deadline, we
can use the algorithms for validating independent jobs.
Unfortunately, this method does not work when the
system is dynamic. For this reason, the results on
independent jobs presented in the next section cannot
be readily extended to deal with dependent jobs in
dynamic systems.
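The computation of effective release times and deadlines can be sketched as two passes over the precedence graph. This example assumes the usual convention that the effective deadline of a job is the minimum over the job and its successors, and that its effective release time is the maximum over the job and its predecessors. The function name, the dictionary-based graph representation, and the sample data are assumptions made for illustration; the graph is assumed to be acyclic.

```python
def effective_times(release, deadline, successors):
    """Return (effective_release, effective_deadline) dictionaries.

    `release` and `deadline` map job -> time; `successors` maps a job to
    the list of its immediate successors in an acyclic precedence graph.
    """
    order, seen = [], set()

    def visit(job):
        # Depth-first post-order: a job is appended after all its successors.
        if job in seen:
            return
        seen.add(job)
        for nxt in successors.get(job, []):
            visit(nxt)
        order.append(job)

    for job in release:
        visit(job)

    eff_deadline = dict(deadline)
    for job in order:  # successors are finalized before the job itself
        for nxt in successors.get(job, []):
            eff_deadline[job] = min(eff_deadline[job], eff_deadline[nxt])

    eff_release = dict(release)
    for job in reversed(order):  # predecessors are finalized first
        for nxt in successors.get(job, []):
            eff_release[nxt] = max(eff_release[nxt], eff_release[job])

    return eff_release, eff_deadline


# Hypothetical chain A -> B -> C.
release = {"A": 3, "B": 1, "C": 5}
deadline = {"A": 10, "B": 12, "C": 8}
successors = {"A": ["B"], "B": ["C"]}

eff_release, eff_deadline = effective_times(release, deadline, successors)
print(eff_release, eff_deadline)
```

In this chain, B cannot usefully start before A is released and must finish before C's deadline, so its effective window shrinks from [1, 12] to [3, 8].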

When the release times of jobs are arbitrary, rather
than periodic, it is slightly more complicated to determine
the values of release times that maximize the
potential demand for time in the interval between the
release time and deadline of the job being analyzed.
This can be done efficiently, however, and Algorithm
IPMJ presented in Section 4 for transforming a set
of jobs with jittered release times to fixed release times
can be used for this purpose.

4  Methods  for  Validating  Dynamic 
Systems 

We  now  focus  our  attention  on  dynamic  multipro¬ 
cessor  systems.  Specifically,  the  performance  bounds 
and  algorithms  presented  are  for  jobs  that  are  indepen¬ 
dent  and  do  not  share  any  resources.  Proofs  of  the¬ 
orems  in  this  section  and  examples  to  illustrate  them 
can  be  found  in  [11]. 

Conditions  for  Predictable  Execution 

It is easy to find the worst-case and best-case completion
times of independent P/M/F jobs. In particular,
the following theorem and corollary allow us to
conclude that the execution of independent P/M/F
jobs is predictable. To find the worst-case (or best-case)
response time of a job $J_i$ in a set $\mathbf{J}_n$ of independent
P/M/F jobs with arbitrary and fixed release
times, we apply the given scheduling algorithm on the
set $\mathbf{J}_n^+$ (or $\mathbf{J}_n^-$) where all jobs have their maximum (or
minimum) execution times. The response time of $J_i$
according to the resultant schedule $A_n^+$ (or $A_n^-$) is its
largest (or smallest) possible response time. We are
sure that $J_i$ always meets its deadline if it meets its
deadline in the maximal schedule $A_n^+$.

Theorem 1 The start time of every job in a set
of independent P/M/F jobs is predictable, that is,
$S^+(J_i) \ge S(J_i) \ge S^-(J_i)$.

Corollary 1 The completion time of every job in a
set of independent P/M/F jobs is predictable, that is,
$F^+(J_i) \ge F(J_i) \ge F^-(J_i)$.
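The practical consequence is that two simulations suffice for independent P/M/F jobs: one with all maximum and one with all minimum execution times. The sketch below illustrates this on a hypothetical four-job, two-processor system in which only the second job has a variable execution time. The unit time step, the function name `simulate_pmf`, and the job parameters are assumptions made for the example.

```python
def simulate_pmf(jobs, m):
    """Preemptive, migratable, fixed-priority scheduling on m processors.

    `jobs` is a list of (name, release, exec_time) in decreasing priority
    order. In every unit time step the m highest-priority ready jobs run.
    Returns a dict mapping job name to completion time.
    """
    remaining = {name: e for name, _, e in jobs}
    finish = {}
    t = 0
    while len(finish) < len(jobs):
        ready = [name for name, r, _ in jobs if r <= t and name not in finish]
        for name in ready[:m]:  # list order is priority order
            remaining[name] -= 1
            if remaining[name] == 0:
                finish[name] = t + 1
        t += 1
    return finish


def job_set(e2):
    # Hypothetical parameters; the execution time of J2 ranges over [2, 6].
    return [("J1", 0, 5), ("J2", 0, e2), ("J3", 4, 8), ("J4", 0, 10)]


f_minimal = simulate_pmf(job_set(2), 2)   # minimal schedule
f_maximal = simulate_pmf(job_set(6), 2)   # maximal schedule
actual = {e2: simulate_pmf(job_set(e2), 2) for e2 in range(2, 7)}

# Every actual completion time lies between the two observable ones.
bracketed = all(f_minimal[n] <= actual[e2][n] <= f_maximal[n]
                for e2 in actual for n in f_maximal)
print(f_minimal["J4"], f_maximal["J4"], bracketed)
```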

When there are jitters in release times, whether $J_i$
is schedulable depends not only on its own release time
and execution time but also on the release times and
execution times of all the higher-priority jobs. In trying
to find the worst-case completion time of a job,
we cannot simply choose the earliest or the latest release
times of higher-priority jobs. Algorithm IPMJ
is based on this observation. This algorithm tries to
find bounds of start times and completion times of
independent P/M/J jobs, by considering one job at
a time, from the job with the highest priority to the
job with the lowest priority. It transforms $J_i$ and the
jobs that have priorities higher than $J_i$ into jobs with
fixed release times as follows. Let $K_k$ denote the job
transformed from $J_k$ and $\mathbf{K}_i$ denote the set of transformed
jobs $\{K_1, K_2, \ldots, K_i\}$. Algorithm IPMJ has
the following three steps:

(1) Step 1 computes the parameters of the transformed
job $K_i$ from the parameters of $J_i$. Specifically,
$K_i$'s execution time is equal to $J_i$'s maximum
execution time, $e_i^+$, plus the length of its
jitter interval, $(r_i^+ - r_i^-)$. $K_i$'s release time is $J_i$'s
earliest release time, $r_i^-$.

(2) Step 2 computes the parameters of $K_k$ for each
of $k = 1, 2, \ldots, i - 1$. $K_k$'s execution time is $e_k^+$,
that is, the maximum execution time of $J_k$. $K_k$'s
release time is chosen among $r_k^-$, $r_k^+$ and $r_i^-$ so
that the overlap between the feasible intervals of
$K_i$ and $K_k$ is as large as possible. Specifically, if
$r_k^- \le r_i^- \le r_k^+$, $K_k$'s release time is $r_i^-$; if $r_k^+ <
r_i^-$, $K_k$'s release time is $r_k^+$ (as late as possible);
and if $r_k^- > r_i^-$, $K_k$'s release time is $r_k^-$ (as early
as possible).

(3) Step 3 schedules $\mathbf{K}_i$ according to the given preemptable,
migratable, priority-driven algorithm.
An upper bound of the completion time $F(J_i)$ of
$J_i$ is equal to the completion time of $K_i$ in the
resultant schedule $A'_i$.
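The transformation in the first two steps can be sketched as follows. This is an interpretation rather than the authors' code: Step 2 is read here as choosing, within the jitter interval of each higher-priority job, the release time closest to the earliest release time of the job being analyzed. The type names, the function name `transform_for`, and the sample jobs are hypothetical. Step 3 (scheduling the transformed set) is not shown.

```python
from typing import List, NamedTuple


class JitteredJob(NamedTuple):
    r_min: int   # earliest release time
    r_max: int   # latest release time
    e_max: int   # maximum execution time


class FixedJob(NamedTuple):
    release: int
    exec_time: int


def transform_for(jobs: List[JitteredJob], i: int) -> List[FixedJob]:
    """Build the transformed set K_1..K_i for job J_i (i is 1-based).

    `jobs` is in decreasing priority order.
    """
    target = jobs[i - 1]
    transformed = []
    for higher in jobs[:i - 1]:
        # Step 2 (as read here): clamp the target's earliest release time
        # into the higher-priority job's jitter interval.
        release = min(max(target.r_min, higher.r_min), higher.r_max)
        transformed.append(FixedJob(release, higher.e_max))
    # Step 1: the target keeps its earliest release time and absorbs its
    # own jitter as extra execution time.
    jitter = target.r_max - target.r_min
    transformed.append(FixedJob(target.r_min, target.e_max + jitter))
    return transformed


# Hypothetical jobs in decreasing priority order; the last one is analyzed.
jobs = [
    JitteredJob(0, 2, 3),   # jitter interval ends before the target's r_min
    JitteredJob(6, 9, 2),   # jitter interval starts after the target's r_min
    JitteredJob(1, 4, 1),   # jitter interval contains the target's r_min
    JitteredJob(3, 5, 4),   # target job
]
k_set = transform_for(jobs, 4)
print(k_set)
```

Because the release times of the higher-priority jobs depend on the job being analyzed, the transformed set has to be rebuilt for each job, which is why the sketch takes the index of the target job as a parameter.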

Because  Ki-i  is  not  a  subset  of  Ki,  Ki  needs  to  be 
constructed  for  every  job  Ji .  Consequently,  the  com¬ 
plexity  of  Algorithm  JPAd,/  is  0{n/).  The  following 
theorem allows us to conclude that if $K_i$ can complete
by the deadline $d_i$ of $J_i$ in the schedule $A'_i$ generated
by the IPMJ algorithm, then $J_i$ is schedulable for all
possible combinations of release times and execution
times.

Theorem 2 The completion time $F(J_i)$ of $J_i$ is no
later than the completion time of the transformed job
$K_i$ in the observable schedule $A'_i$ of $\mathbf{K}_i$ generated by
Algorithm IPMJ.

In  the  special  case  when  independent  jobs  have 
zero,  or  identical,  release  times  and  jobs  have  fixed 
priorities,  preemption  and  migration  can  never  occur. 
Therefore,  it  does  not  matter  whether  preemption  and 
migration  are  allowed  or  not.  The  following  theorem 
follows  straightforwardly  from  this  observation. 

Theorem 3 The execution of the independent
N(P)/N/Z jobs is predictable.

When  jobs  have  arbitrary  release  times  and  are  not 
migratable, their execution behavior is no longer predictable.
This fact is illustrated by the example in
Figure  1.  While  the  execution  of  independent  P/N/F 
jobs  is  not  predictable  for  arbitrary  priority  assign¬ 
ments,  it  is  predictable  when  the  jobs  are  scheduled 
in  the  order  in  which  they  are  released.  This  fact  is 
stated  formally  in  the  following  theorem. 

Theorem 4 When the priorities of independent
P/N/F jobs are assigned on the FIFO basis (that is,
the earlier the release time, the higher the priority),
the execution of the jobs is predictable.

Completion  Times  of  P/N/F  Jobs 

When independent P/N/F jobs are not scheduled
on the FIFO basis and their execution is no
longer predictable, we can bound their start times
and completion times according to the following theorem.
We note that when we want to determine
whether $J_i$ is schedulable, there is no need to consider
$J_{i+1}, J_{i+2}, \ldots, J_n$, since jobs are preemptable and
independent. Therefore, we can confine our attention to
$\mathbf{J}_i$. The theorem is stated in terms of the set $D_i$. $D_i$
is a subset of $\mathbf{J}_i$ in which each job is released after
some job in $\mathbf{J}_i$ (call it $J_h$) with a priority lower than
itself and is not scheduled to start and complete before
$J_i$ on the same processor as $J_i$ in the maximal
schedule. In the example in Figure 1, $D_1$, $D_2$ and $D_3$
are null. $D_4$ is $\{J_3\}$.

Theorem 5

$$S(J_i) \le S^+(J_i) + \sum_{J_k \in D_i} e_k^+, \qquad F(J_i) \le F^+(J_i) + \sum_{J_k \in D_i} e_k^+$$

The worst-case bound of completion time given by
this theorem is sometimes too pessimistic. For example,
this theorem tells us that the completion times of
the jobs in Figure 1 are no greater than 5, 6, 13 and 24,
respectively. The bound for $F(J_4)$ is pessimistic. As
another example, we consider a simple system containing
$m$ independent jobs and $m$ identical processors.
The release times of the jobs $J_1, J_2, \ldots, J_m$ are
such that $r_m < r_{m-1} < \cdots < r_1$. The priority order
is $(J_1, J_2, \ldots, J_m)$ with $J_1$ having the highest priority.
Obviously, every job $J_i$ can be scheduled immediately
after its release time and can always complete at or before
its observable worst-case completion time $F^+(J_i)$.
The bound of the worst-case completion time of $J_m$
computed from Theorem 5 can be as large as $m$ times
the actual completion time of $J_m$, and is therefore not
useful.

Sometimes, we can use the information provided by the two observable
schedules to derive more accurate predictions of job completion times.
For example, we consider tests that begin by examining the sequences
in which the jobs start execution according to the minimal and maximal
schedules. Let $p_i^+(t)$ (or $p_i^-(t)$) be the sequence of jobs
whose observable start times are at or before $t$ according to the
maximal schedule $A_i^+$ (or minimal schedule $A_i^-$) of
$\mathbf{J}_i$; the jobs in the sequence appear in order of increasing
start times. For example, in Figure 1, $p_4^+(S^+(J_4))$ is
$(J_1, J_2, J_3, J_4)$ and $p_4^-(S^-(J_4))$ is $(J_1, J_2, J_4)$.
Similarly, let $p_i(t)$ be the corresponding sequence of jobs in
increasing order of their actual start times according to the actual
schedule $A_i$, including all jobs whose actual start times are at or
before $t$. We call $p_i(t)$ the actual starting sequence, and
$p_i^+(t)$ and $p_i^-(t)$ the maximal and minimal observable starting
sequences according to $A_i^+$ and $A_i^-$, respectively. We say that
a sequence $X$ is a subsequence of a sequence $Y$ if $Y$ contains
every element of $X$ and the elements common to $X$ and $Y$ appear in
the same order in both. In Figure 1,
$p_4^-(S^-(J_4)) = (J_1, J_2, J_4)$ is a subsequence of
$p_4^+(S^+(J_4)) = (J_1, J_2, J_3, J_4)$. Similarly,
$p_4^+(S^+(J_3)) = (J_1, J_2, J_3)$ is a subsequence of
$p_4^-(S^-(J_3)) = (J_1, J_2, J_4, J_3)$.
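
The definitions of a starting sequence and of a subsequence can be
made concrete with a short sketch. The function names and the start
times below are illustrative assumptions; the numbers are hypothetical
and are not taken from Figure 1, although they are chosen so that the
resulting sequences match the ones quoted in the text.

```python
from typing import Dict, List, Sequence


def starting_sequence(start_times: Dict[str, float], t: float) -> List[str]:
    """Jobs whose start time is at or before t, by increasing start time."""
    started = [(start, job) for job, start in start_times.items() if start <= t]
    return [job for _, job in sorted(started)]


def is_subsequence(x: Sequence[str], y: Sequence[str]) -> bool:
    """True if every element of x occurs in y in the same relative order."""
    remaining = iter(y)
    # `job in remaining` advances the iterator, which enforces the order.
    return all(job in remaining for job in x)


# Hypothetical start times in a maximal and a minimal schedule.
max_starts = {"J1": 0, "J2": 2, "J3": 6, "J4": 8}
min_starts = {"J1": 0, "J2": 2, "J4": 4, "J3": 6}

p_max = starting_sequence(max_starts, max_starts["J4"])
p_min = starting_sequence(min_starts, min_starts["J4"])
print(p_max, p_min, is_subsequence(p_min, p_max))
```

Sorting on (start time, job name) breaks ties by name; a real
scheduler would break ties by priority instead, which is left out of
this sketch.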

When we want to determine whether $J_i$ is schedulable, we first
examine whether there is preemption in the maximal schedule. In the
simpler case, no job in $\mathbf{J}_i$ is preempted in the maximal
schedule $A_i^+$. Then, we examine whether the two observable starting
sequences $p_i^+(S^+(J_i))$ and $p_i^-(S^-(J_i))$ are identical. If
the two sequences are identical, we can conclude that no job in
$\mathbf{J}_i$ is preempted and the orders in which jobs start
execution are the same, according to all the schedules of
$\mathbf{J}_i$ for all combinations of execution times of jobs in
$\mathbf{J}_i$. Therefore, the latest completion time of $J_i$ is
$F^+(J_i)$. This fact is stated in Theorem 6.

Theorem 6  If no job in $\mathbf{J}_i$ is preempted according to
$A_i^+$ and the two observable starting sequences $p_i^+(S^+(J_i))$
and $p_i^-(S^-(J_i))$ are identical, then
$F^-(J_i) \le F(J_i) \le F^+(J_i)$.

Obviously, the upper bound of $F(J_i)$ given by this theorem is tight.
For example, for the system in Figure 1, no job in $\mathbf{J}_3$ is
preempted according to the maximal schedule $A_3^+$, and
$p_3^+(S^+(J_3))$ is identical to $p_3^-(S^-(J_3))$. Consequently, we
can conclude that the completion time of $J_3$ is never later than
$F^+(J_3)$, which is 13 according to Figure 1 (a). Similarly, for the
system consisting of $m$ jobs that is mentioned earlier, no job in
$\mathbf{J}_m$ is preempted according to the maximal schedule $A_m^+$,
and $p_m^+(S^+(J_m))$ is identical to $p_m^-(S^-(J_m))$. Hence the
completion time of each job $J_i$ is at most equal to $F^+(J_i)$.
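
The test described by Theorem 6 can be written as a small check. This
is a minimal sketch: the function name and argument layout are
assumptions made for illustration, and the caller is assumed to have
already built the two observable schedules and extracted the starting
sequences and the preemption flag from them.

```python
from typing import Optional, Sequence


def theorem6_upper_bound(
    preempted_in_maximal: bool,
    max_sequence: Sequence[str],
    min_sequence: Sequence[str],
    f_plus: float,
) -> Optional[float]:
    """Return F+(J_i) when the conditions of Theorem 6 hold, else None.

    None means only that Theorem 6 gives no bound; it does not mean the
    job is unschedulable.
    """
    if preempted_in_maximal:
        return None
    if list(max_sequence) != list(min_sequence):
        return None
    return f_plus


# J_3 in Figure 1: identical sequences, no preemption, F+(J_3) = 13.
bound = theorem6_upper_bound(False, ["J1", "J2", "J3"], ["J1", "J2", "J3"], 13)
print(bound)
```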

A natural question to ask at this point is whether a tight bound can
be derived in a similar manner for the case when there is no
preemption in the maximal schedule and $p_i^-(S^-(J_i))$ is a
subsequence of $p_i^+(S^+(J_i))$. Figure 2 illustrates the
impossibility. Parts (a) and (b) show the maximal and minimal
schedules, respectively. Part (c) shows a possible actual schedule.
$J_1$, $J_2$, $J_3$ and $J_4$ have the same parameters as the ones in
Figure 1 except that the execution time of $J_2$ is in the range
[1, 6] and the release time of $J_4$ is 2. The parameters of $J_5$,
$J_6$ and $J_7$ are listed in the table in Figure 2.
$p_7^-(S^-(J_7))$ is a subsequence of
$p_7^+(S^+(J_7)) = (J_1, J_2, J_3, J_4, J_6, J_7)$, but the actual
starting sequence
$p_7(S(J_7)) = (J_1, J_2, J_4, J_3, J_5, J_6, J_7)$ is not.
Furthermore, the set of jobs in the actual starting sequence
$p_7(S(J_7))$ is not a subset of the set of jobs in the maximal
starting sequence $p_7^+(S^+(J_7))$. ($J_5$ is not in
$p_7^+(S^+(J_7))$ but is in $p_7(S(J_7))$.) The actual start time of
$J_7$ is larger than the maximal start time $S^+(J_7)$, illustrating
that the start time is not predictable in this case. Moreover,
according to both $A_7^+$ and $A_7^-$, no job is preempted before the
start time $S(J_7)$ of $J_7$, but $J_4$ and $J_6$ are preempted before
$S(J_7)$ according to the actual schedule in Figure 2 (c).




Figure  2:  An  example  illustrating  unpredictable  start 
times  and  completion  times 

Similarly, when some jobs in $\mathbf{J}_i$ are preempted before the
completion time of $J_i$ in the maximal schedule, the actual starting
sequence may be different from the observable starting sequences, even
though the two observable starting sequences are the same. When some
jobs in $\mathbf{J}_i$ are preempted before the completion time of
$J_i$ in the maximal schedule, the start time and completion time of
$J_i$ may be unpredictable, even though all the starting sequences are
the same. In the actual schedule, a job may be preempted by a
different job from the one in the maximal schedule, even though all
the starting sequences are the same. Examples illustrating these facts
can be found in [12]. These examples lead us to believe that it is
unlikely for us to find tighter bounds than the one given by Theorem 5
when there is preemption in the maximal schedule. In many examples, in
fact, the completion times of the jobs are accurately predicted by the
upper bounds given by Theorem 5.

Completion  Times  of  N/N/F  Jobs 

We now consider the case when all jobs are nonpreemptable and the
release times of all jobs are arbitrary but fixed. A lower-priority
nonpreemptable job whose release time is earlier than that of $J_i$
may be executed to completion after $J_i$ in the observable schedules
but before $J_i$ in the actual schedule. Consequently, we cannot
ignore such lower-priority jobs when trying to find the start time and
the completion time of $J_i$. Let $N$ denote the set of nonpreemptable
jobs and $N_i$ denote the subset of nonpreemptable jobs that have
release times earlier than that of $J_i$ and priorities lower than
that of $J_i$. Let $P^-$ be a schedule of $\mathbf{J}^-$ constructed
according to the given scheduling algorithm, but assuming that all the
jobs are preemptable and migratable. Let $B_i$ denote the set of jobs
in $N_i$ which start before $J_i$ in $P^-$.

It has been shown [11] that any job $J_k$ not in $B_i$ cannot start
before $J_i$ in the actual schedule $A_n$. Algorithm INNF makes use of
this fact to eliminate some of the lower-priority jobs in $N_i$ from
consideration when trying to bound the worst-case completion times. It
considers one job at a time, from the job with the highest priority to
the job with the lowest priority. In order to find the worst-case
completion time of $J_i$, Algorithm INNF transforms $J_i$ and the jobs
in $\mathbf{J}_{i-1}$. In this transformation, every job is
transformed into two jobs. Let $G_k$ and $H_k$ denote the two jobs
transformed from $J_k$. Algorithm INNF has three steps:

(1)  In Step 1 the parameters of the transformed jobs $G_i$ and $H_i$
are computed from those of $J_i$. $G_i$'s execution time is equal to
the largest of the maximum execution times of jobs in $B_i$ if $B_i$
is nonempty, and is equal to zero if $B_i$ is empty. $G_i$'s release
time is the release time $r_i$ of $J_i$. $H_i$'s execution time is
equal to $J_i$'s maximum execution time, and $H_i$'s release time is
$r_i$ plus $G_i$'s execution time. Let 0 be a priority that is higher
than the priority of $J_1$. The priority of $G_i$ is 0 and the
priority of $H_i$ is equal to that of $J_i$. $G_i$ simulates the job
that may block $J_i$, and $H_i$ simulates $J_i$ blocked by $G_i$.
Step 1 also computes the parameters of $G_k$ and $H_k$ for each $J_k$,
for $k = 1, \ldots, i-1$. $G_k$ has release time $r_k$. Its execution
time is equal to the largest of the maximum execution times of jobs in
$B_k$ if $B_k$ is nonempty, and is equal to zero if $B_k$ is empty.
$G_k$'s priority is equal to 0. $H_k$ is a job with a jittered release
time whose earliest value is equal to $r_k$ and whose latest value is
equal to $r_k$ plus the execution time of $G_k$. $H_k$'s execution
time is equal to $e_k^+$ and its priority is equal to that of $J_k$.

(2)  Step 2 uses Algorithm IPMJ presented earlier and transforms each
job $H_k$ into a job $L_k$ with a fixed release time, for
$k = 1, \ldots, i-1$. Let $\mathbf{G}_i$ and $\mathbf{L}_{i-1}$ denote
the sets of jobs $\{G_1, \ldots, G_i\}$ and
$\{L_1, \ldots, L_{i-1}\}$, respectively.


(3)  In Step 3, $\mathbf{G}_i$, $\mathbf{L}_{i-1}$ and $H_i$ are
scheduled according to the given nonpreemptable, priority-driven
algorithm. An upper bound of the completion time $F(J_i)$ of $J_i$ is
equal to the completion time of $H_i$ in the resultant schedule.

Figure 3 illustrates how Algorithm INNF predicts the worst-case
completion time of the job $J_5$. The schedules in parts (a), (b) and
(c) are the maximal schedule, the actual schedule, and the schedule
generated by Algorithm INNF, respectively. In the actual schedule,
$J_4$ blocks $J_3$, and $J_3$ delays $J_5$ since it has a higher
priority than $J_5$. In other words, $J_5$'s start time is delayed
because $J_4$ blocks $J_3$. In the schedule generated by Algorithm
INNF, the blocking of $J_3$ by $J_4$ is accounted for by its blocking
job $G_3$, which has release time $r_3$ and the execution time
computed in Step 1.

Figure 3: An example illustrating Algorithm INNF

Because $\mathbf{L}_{i-1}$ is not a subset of $\mathbf{L}_i$,
$\mathbf{L}_i$ needs to be constructed for every job $J_i$.
Consequently, the complexity of Algorithm INNF is O(n^). The following
theorem allows us to conclude that if $H_i$ can complete by the
deadline $d_i$ of $J_i$ in the schedule generated by the algorithm,
then $J_i$ always completes by $d_i$.

Theorem 7  The completion time $F(J_i)$ of $J_i$ is no later than the
completion time of the transformed job $H_i$ in the schedule of
$\mathbf{G}_i$, $\mathbf{L}_{i-1}$ and $H_i$ generated by Algorithm
INNF.

Because lower-priority preemptable jobs do not block any job, we can
use Algorithm INNF with very little change to find the completion
times of all jobs when some jobs are preemptable and migratable.
Specifically, Step 3 of Algorithm INNF treats (1) all the jobs in
$\mathbf{G}_i$ as nonpreemptable, (2) $L_k$ in $\mathbf{L}_{i-1}$ as
preemptable (nonpreemptable) if $J_k$ is preemptable (nonpreemptable),
and (3) $H_i$ as preemptable (nonpreemptable) if $J_i$ is preemptable
(nonpreemptable).
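
The parameter computation in Step 1 of the algorithm for N/N/F jobs
can be sketched as follows. This is only an illustration under stated
assumptions: the class and function names are hypothetical, priorities
are assumed to be integers with smaller values meaning higher priority
(so that 0 outranks every real job), and the blocking set $B_k$ is
assumed to be supplied by the caller. Steps 2 and 3, which need the
jitter-removal algorithm and a scheduler, are not shown.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

BLOCKER_PRIORITY = 0  # assumed: smaller number = higher priority


@dataclass(frozen=True)
class Job:
    name: str
    release: float
    max_exec: float
    priority: int  # assumed to be >= 1 for real jobs


@dataclass(frozen=True)
class TransformedJob:
    name: str
    earliest_release: float
    latest_release: float
    exec_time: float
    priority: int


def transform(
    job: Job, blocking_set: Iterable[Job], is_target: bool
) -> Tuple[TransformedJob, TransformedJob]:
    """Build the blocking job G and the blocked job H for one job J_k."""
    # G runs for the largest maximum execution time in B_k, or 0 if empty.
    g_exec = max((b.max_exec for b in blocking_set), default=0.0)
    g = TransformedJob(f"G_{job.name}", job.release, job.release,
                       g_exec, BLOCKER_PRIORITY)
    latest = job.release + g_exec
    # For the job under analysis H is released at r + G's execution time;
    # for higher-priority jobs H has a jittered release in [r, r + G's time].
    earliest = latest if is_target else job.release
    h = TransformedJob(f"H_{job.name}", earliest, latest,
                       job.max_exec, job.priority)
    return g, h


# Hypothetical parameters: J4 is the only job that may block J3.
j3 = Job("3", release=4.0, max_exec=3.0, priority=3)
j4 = Job("4", release=2.0, max_exec=5.0, priority=4)
g3, h3 = transform(j3, [j4], is_target=True)
print(g3)
print(h3)
```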


In the case where some jobs are nonpreemptable and some jobs are
preemptable but not migratable, the actual start time and completion
time of $J_i$ may be postponed beyond $S^+(J_i)$ and $F^+(J_i)$ by two
kinds of jobs. A nonpreemptable lower-priority job may block some jobs
in the actual schedule but not in the maximal schedule. Some
higher-priority jobs may preempt some jobs in the actual schedule but
not in the maximal schedule.

When preemptable jobs are not migratable, a lower-priority job in
$N_i$ but not in $B_i$ can start before $J_i$ in the actual schedule.
We must consider all the jobs in $N_i$ when trying to bound the
completion time of $J_i$. Also, the schedule of $\mathbf{J}^-$
constructed by assuming all the jobs are preemptable and nonmigratable
gives us no information on which lower-priority jobs can actually
start before $J_i$. Algorithm INNF-N can be used to bound the
completion times in this case. It consists of two steps. Step 1
considers the delays in the start time of each job by nonpreemptable
lower-priority jobs. It uses Algorithm INNF to construct a schedule
for each $J_i$, using the set $N_i$ instead of $B_i$ in Step 1 of
Algorithm INNF. Let $F^+(J_i)$ denote the completion time of $H_i$ in
this schedule. Then it computes the delays in the completion time of
$J_i$ due to higher-priority jobs which may preempt $J_i$, or some job
starting before $J_i$, in the actual schedule. Step 2 makes use of
Theorem 8, stated below. This theorem is stated in terms of a set of
jobs that is a subset of $\mathbf{J}_i$ in which each job $J_k$ is
released after some preemptable job in $\mathbf{J}_i$ with a priority
lower than itself (that is, lower than that of $J_k$), and $L_k$ (the
job created in Step 2 of Algorithm INNF) is not scheduled on the same
processor as $H_i$ to complete before $H_i$ in the schedule
constructed by Algorithm INNF. The complexity of Algorithm INNF-N is
$O(n^2)$.

Theorem 8  $F(J_i) \le F^+(J_i) + \sum_{J_k} (e_k^+ - e_k^-)$, where
the sum is taken over the jobs $J_k$ in the set described above.

5  Summary  and  Future  Work 

In this paper, we have discussed some of the challenges in validating
timing constraints of dynamic multiprocessor and distributed systems.
In recent years, many load-balancing and scheduling algorithms that
dynamically dispatch and schedule jobs on available processors have
been developed. These algorithms often achieve better response times
and fuller resource utilization than the traditional algorithms that
statically assign and bind jobs to processors. Unfortunately,
analytical methods and efficient algorithms for validating that all
jobs always complete by their deadlines in dynamic systems do not
exist, and exhaustive simulation and testing are unreliable and
expensive. Until reliable and efficient validation methods become
available, the modern scheduling paradigms cannot be used in hard
real-time systems.

We have presented here several worst-case upper bounds and efficient
algorithms. They can reliably predict the worst-case completion times
of independent jobs in homogeneous dynamic distributed systems. One of
the algorithms allows us to take into account release-time jitter. The
others assume fixed arbitrary release times but take into account the
effects of nonpreemptability and nonmigratability. These results
constitute a small part of the theoretical basis needed for a
comprehensive validation strategy that is capable of dealing with
dynamic distributed real-time systems. Much of the work on this
problem remains to be done. For example, we must be able to deal with
dependencies between jobs. Ways to reliably predict the worst-case
completion times of jobs that have precedence constraints and/or share
resources are not yet available. This, as well as the work on
predicting the completion times of jobs in heterogeneous systems, is a
part of our future work.
Abstract

This  paper  describes  a  generic  approach  to  mitigate 
risk  when  reengineering  for  high  throughput  massively 
parallel  systems.  The  approach  entails  baselining  the 
existing  system,  capturing  the  functional  requirements, 
estimating  initial  processing  requirements  through  a  high 
level  analysis,  benchmarking  a  subset  of  the  functionality 
on  a  low  throughput  computer,  and  modeling  the  high 
throughput  application  to  determine  the  detailed 
processing  requirements  for  scaling. 

1:  Introduction 

"Once  the  architecture  begins  to  take  shape,  the  sooner 
contextual constraints and sanity checks are made on
assumptions  and  requirements,  the  better. " 

Eberhardt Rechtin, Systems Architecting:
Creating & Building Complex Systems [1]

Commercial massively parallel processing (MPP) architectures offer a
solution to TERAFLOP (one trillion floating-point operations per
second) computing applications in the Navy. Computing density
(TERAFLOP/cubic foot) and cost (dollars/TERAFLOP) have improved in
recent years; however, the challenge of real-time embedded processing
requirements poses a high risk for complex systems. The high risk
relates to: inefficient match of applications to architectures, low
availability of high throughput architectures, the accuracy of
forecasted downward spiraling price projections, and immature software
development tools (e.g., parallelizing compilers).

This  paper  proposes  a  generic  approach  to  mitigate  the 
risk  when  investing  in  a  specific  MPP  architecture.  The 
approach  proposes  a  series  of  intermediate  steps  to  assess 
the  compatibility  of  the  architecture  and  the  requirements. 
Each  step  refines  the  assessment  and  leads  to  the  final 
tradeoff study. The approach embodies reengineering considerations
when pursuing new implementations for cost or technology upgrades.
Specifically, the approach entails:

1.  defining  the  system  requirements, 

2.  sizing  an  architecture  using  static  benchmarks, 

3.  allocating  resources  using  systems  engineering 
tools, 

4.  developing  a  full  scale  model, 

5.  validating  the  full  scale  model  with  dynamic 
benchmarks, 

6.  assessing  the  compatibility  of  the  architecture 
with  the  real-time  embedded  applications,  and 

7.  selecting  the  appropriate  design  approach  based 
on  a  trade-off  analysis. 

The  tradeoff  analysis  incorporates  an  assortment  of 
design  tools  to  expedite  and  facilitate  the  decision  making 
process.  Examples  of  tools  used  to  date  include:  VHSIC 
Hardware Description Language (VHDL), RDD-100 and
OMTool.  The  approach  will  integrate  the  systems 
engineering  tools  developed  under  the  direction  of  the 
Naval  Surface  Warfare  Center  (NSWC)  within  the  Office 
of  Naval  Research's  (ONR)  Engineering  of  Complex 
Systems  (ECS)  Block  Program  as  they  become  available. 

The  generic  approach  suits  a  myriad  of  applications 
ranging  from  radar  to  sonar  systems.  Accordingly,  air, 
surface  and  subsurface  platforms  can  benefit  from  the 
approach  outlined  in  this  paper.  This  paper  uses  the  case 
study  method  to  showcase  the  approach. 

The case study method applies the generic approach to a practical
application. This paper discusses one case study to demonstrate the
utility of the approach. The research will explore additional case
studies as interest arises.

This paper describes the case study, outlines the generic systems
engineering design approach, presents high level architectural sizing
techniques, and discusses detailed modeling and requirements
allocation issues. The Wide Aperture Array (WAA) beamforming problem
serves as the initial case study for the application of the generic
approach. The approach will evolve based on the comments, suggestions
and progress of research performed in the ECS Program under the
direction of the Naval Surface Warfare Center (NSWC).

Figure 1. WAA Processing

2:  Overview  of  WAA  case  study 

NUWC chose the WAA in-board electronics application for two reasons.
First, reengineering the WAA system contributes to the incremental
insertion of commercial off-the-shelf equipment (COTS) into submarine
warfare systems [2]. Second, a new implementation for the WAA inboard
electronics suite would fulfill the Navy's need to reduce cost, as
well as align with the Director of Defense Research and Engineering's
(DDR&E) thrust areas. Two of DDR&E's seven thrust areas emphasize
Affordability, and Sea Control and Undersea Superiority [3].
Therefore, the reengineering of the existing WAA system responds to
changing cost and commercial technology requirements.

This  paper  provides  a  brief  description  of  the  Wide 
Aperture  Array  full  detection  system.  As  interest  arises, 
the  approach  will  incorporate  diversified  case  studies 
based  on  other  existing  Naval  systems. 

The cost effective implementation of the in-board electronics for a
Wide Aperture Array full detection system serves as the primary
objective for this case study. The WAA system can perform the
detection function for a submarine. Sea test data indicate the Wide
Aperture Array system can provide an attractive capability. In
addition, current fiscal requirements within the Navy (i.e., the New
Attack Submarine Program) have created the need to significantly
reduce the cost of the existing WAA system.

The  WAA  full  detection  concept  conceived  by 
DePrimo  and  Choinski,  and  published  in  "An  Efficient 
Approach  to  Systems  Evolution  (EASE)"  [4],  offers  a 
cost  effective  way  to  implement  the  inboard  electronics 
with  additional  full  detection  capability.  This 
implementation  serves  as  the  case  study  for  this  paper. 

The implementation uses commercial massively parallel processing
technology developed within the High Performance Computing and
Communications (HPCC) Program and made available under the Director of
Defense Research and Engineering's (DDR&E) Defense Modernization Plan.

Figure  1  illustrates  the  WAA  signal  processing 
specified  for  the  reengineering  process.  The  signal 
processing  includes  additions  to  the  existing  WAA 
system. 

This  paper  proposes  a  generic  approach  to  mitigate  the 
risk  when  investing  in  a  massively  parallel  architecture. 
The  approach  concentrates  on  the  importance  of  constraints 
and  sanity  checks  throughout  the  design  process. 

3:  System  engineering  approach 

Figure 2 embodies the intent of Rechtin's heuristic, as quoted at the
beginning of the paper, by setting up a process to assess the
compatibility of the architecture and the requirements throughout the
design process. Each step progressively refines the assessment. The
approach includes reengineering considerations when systems engineers
pursue a new implementation for cost or technology upgrade reasons.

Figure 2. Systems Engineering Design Approach

The  process  starts  with  the  definition  of  functional 
requirements  and  the  selection  of  a  candidate  architecture. 
For  reengineering  problems  like  the  Wide  Aperture  Array, 
the  process  includes  a  step  to  characterize  the  existing 
system.  The  existing  system  characterization  provides  the 
baseline  for  the  tradeoff  analysis. 

An  object  oriented  software  design  follows  the 
functional  specification.  The  inclusion  of  the  object 
oriented  design  step  translates  functional  requirements  to 
objects  suitable  for  software  design.  This  step  will 
determine  if  object  oriented  design  facilitates  software 
portability  and  reuse.  In  practice,  a  systems  designer  could 
bypass  this  step  in  favor  of  functionally  based  software 
design. 

The  sustained  throughput  and  data  rate  estimates 
follow  the  object  oriented  design.  The  throughput  and 
data  rate  estimates  enable  a  preliminary  architectural  sizing 
using  the  performance  data  from  existing  libraries  or  static 
benchmarks.  Static  benchmarks  provide  single  processor 
performance  data  for  metrics  like  efficiency.  Therefore,  the 
architecture  sizing  obtained  at  this  point  allows  for  an 
initial  assessment  of  the  instantaneous  throughput  levels 
quoted  by  manufacturers. 
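
As an illustration of how a preliminary sizing might follow from these
estimates, the sketch below divides a required sustained throughput by
the sustained throughput of one processor. The formula, the function
name and the example figures are assumptions made for illustration;
the paper does not give a sizing formula, and the sketch ignores
communication, I/O and partitioning effects, which the later dynamic
benchmarking steps address.

```python
import math


def processors_needed(
    required_mflops: float,
    peak_mflops_per_processor: float,
    efficiency: float,
) -> int:
    """Estimate the processor count from single-processor benchmark data.

    efficiency is the fraction of the manufacturer's quoted peak that a
    static benchmark actually sustains on one processor (0 < e <= 1).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if required_mflops <= 0 or peak_mflops_per_processor <= 0:
        raise ValueError("throughput values must be positive")
    sustained = peak_mflops_per_processor * efficiency
    return math.ceil(required_mflops / sustained)


# Hypothetical numbers: 1 TERAFLOP sustained, 100 MFLOPS peak per
# processor, 25 percent efficiency measured by a static benchmark.
print(processors_needed(1_000_000, 100, 0.25))
```

Comparing the result for the quoted peak (efficiency of 1.0) with the
result for the measured efficiency shows how far the manufacturer's
instantaneous figure is from what the application can expect.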

Given the preliminary architectural sizing, the systems designer can
perform a detailed analysis of the architectural requirements for the
given application. The detailed analysis consists of a combination of
modeling, simulation and dynamic benchmarking.

Dynamic  benchmarking  entails  the  implementation  of 
a  processing  subset  on  a  scaled  down  MPP  architecture. 
In  this  manner,  dynamic  benchmarking  reduces  risk.  The 
move  from  using  a  single  processor  to  multiple  processors 
differentiates  dynamic  benchmarking  from  static 
benchmarking.  Dynamic  benchmarking  also  introduces 
partitioning,  input/output  (I/O)  issues,  and  event  driven 
processing  attributes. 

In  addition,  the  dynamic  benchmarks  validate  the 
detailed  architecture  models  simulated  in  this  step.  The 
concept  of  using  modeling,  simulation  and  benchmarking 
for architecture validation was first introduced by Muñoz
of the Naval Undersea Warfare Center [5]. Figure 3
elaborates on the allocation process identified in Figure 2.

After  allocating  the  functions,  the  final  tradeoff 
analysis  uses  a  set  of  previously  defined  metrics  to 
compare  the  performance  of  the  proposed  implementation 
to  the  existing  baseline  system.  The  results  of  the  tradeoff 
analysis  determine  whether  to  accept,  modify  or  eliminate 
the  candidate  architecture. 

3.1:  Metrics 

The  basis  of  the  tradeoff  analysis  rests  with  the 
extraction  and  comparison  of  metrics.  Modeling  and 
simulation  permit  measurement  of  the  metrics  for  the 
proposed  system.  The  measured  data  can  be  compared  to 
the  existing  system  baseline  data. 

Numerous metrics have been identified for consideration in the
tradeoff analysis of MPP architectures. Table I presents the MPP
metrics and their definitions. These metrics have been discussed in
detail by Lee [6] and the team of Sweetman and Muñoz [7]. The design
capture view metrics outlined by the ECS research block can also be
added to this general list.

Figure 3. Functional Requirements Allocation

4:  High  level  architectural  sizing 

The  high  level  architectural  sizing  provides  the 
preliminary  estimate  for  the  size  and  configuration  of  a 
compatible  architecture.  The  high  level  architectural  sizing 
consists  of  four  parts: 

1.  capturing functional requirements,

2.  baselining  the  existing  system, 

3.  generating  an  object  oriented  design, 

4.  establishing  a  preliminary  architectural  sizing 
from  static  benchmarks. 

Once  completed,  these  four  steps  lead  to  a  detailed 
architectural  design  and  development.  Unlike  the  detailed 
design,  the  high  level  architectural  sizing  does  not  address 
software  issues  or  partitioning  of  the  functions. 

4.1:  Functional  requirements  definition 

The functional requirements definition phase of the generic system
engineering design approach results in system level specifications for
the application. For the case study highlighted in the paper, a
systems engineer first documented the WAA functional requirements
previously depicted in figure 1.

A systems engineering design tool like RDD-100 can capture the
functional requirements and facilitate traceability throughout the
design process. Initially, a word processor was used to capture the
WAA requirements; however, NUWC will also use tools like RDD-100.

RDD-100 brings several capabilities to the design
process  including:  requirements  capture,  functional 
behavior  modeling,  full  scale  architecture  modeling, 
resource  allocation,  dynamic  analysis  and  documentation 
of  results.  Other  tools  are  also  available  to  provide  this 
capability. 

4.2:  Baseline  system  characterization 

Ideally,  the  systems  design  engineer  should  baseline 
the  existing  system  using  metrics  necessary  to  complete 
the  tradeoff  analysis.  Under  these  conditions,  the  designer 
completes  the  tradeoff  analysis  by  comparing  the  new  and 
existing  systems  on  equal  footing. 

Unfortunately,  even  the  best  documentation  from  a 
military  system  will  fall  short  of  supplying  all  the 
previously  defined  metrics  for  the  tradeoff  analysis. 
Design  engineers  document  their  work  for  development 
and  not  reengineering  purposes.  Therefore,  the  tradeoff 
analysis  will  embody  comparisons  between  similar  but  not 
equivalent  metrics. 

The  reengineering  process  was  initiated  by  using  the 
design  capture  views  established  by  the  ECS  Block  to 
capture  the  implementation  of  the  existing  WAA 
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Table  I.  Metrics 


1. 

Computation  Bandwidth 

A  description  of  the  frequency  of  operations  per  unit  time  measured  in 
MFLOPS/second. 

2. 

Communication  Bandwidth 

A  description  of  the  I/O  rate  measured  in  MBytes/second. 

3. 

Memory  Bandwidth 

A  measure  of  the  memory  access  requirements  per  unit  time 
represented  by  Bytes/second. 

4. 

FLOPS-I/O  Ratio 

A  ratio  which  compares  the  computation  Ioad(MFLOPS)  to 
the  I/O  (Bytes/sec)  load. 

5. 

Latency-  FLOPS  Product 

A  characterization  of  the  ability  to  support  communications 

requirements  versus  the  computational  bandwidth  requirements  of  a  module  or 

architectural  element 

6. 

Power/Weight/Volume 

Values  used  to  characterize  the  physical  attributes  of  a 

system.  Power  is  characterized  by  Watts,  weight  by  pounds  (Ib)  and  volume  by 
cubic  feet  (ft^). 

7. 

dBAVatts 

A  measure  which  combines  process  gain  (dB),  algorithm 
efficiency,  dB/gate-Hz,  technology  cost,  gate  -  Hz/watts,  architecture  efficiency, 
and  percent  duty  cycle.  An  alternative  is  to  use  noise  recognition  differential 
(NRD)  instead  of  process  gain  for  a  measure  of  sonar  system  performance. 

8. 

Architecture  Diameter 

An  integer  which  represents  the  maximum  number  of 

communication  paths  that  a  message  or  data  may  be  required  to  travel  from 
processor  to  processor. 

9, 

Architecture  Latent^ 

The  maximum  time,  in  seconds,  a  message  takes  to  propagate 
aaoss  the  path  that  determines  architecture  diameter. 

10. 

Processor  Memory  Ratio 

A  ratio  that  captures  the  memory  available  to  an  individual 
processor.  For  local  memory  systems  the  ratio  would  be  the  local  memory  per 
processor.  For  shared  memory  systems  the  ratio  would  be  computed  by  dividing 
the  total  system  memory  by  the  number  of  processors  and  adding  the  amount  of 
local  cache  memory  per  processor. 

11. 

Average  Message  Size 
per  Processor 

A  value  computed  by  dividing  the  total  number  of  message 

bytes  sent  during  the  time  it  takes  to  execute  an  algorithm,  divided  by  the  number 

of  processors. 

12. 

Response  Time 

The  time  in  seconds  that  is  required  to  execute  an  algorithm. 

The  time  begins  when  the  first  processor  starts  executing  and  ends  when  the  last 
processor  stops  executing. 

13. 

Processor  Utilization 

A  percentage  computed  by  dividing  the  sum  of  the  individual 
times  that  the  processors  are  executing  by  the  total  time  it  takes  to  execute  the 
algorithm,  times  the  number  of  processors  in  the  system. 

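$$\text{Processor Utilization} = \frac{t_1 + t_2 + t_3 + \cdots + t_n}{N\,T}$$

where $t_i$ is the time processor $i$ spends executing, $N$ is the number of processors in the system, and $T$ is the total time it takes to execute the algorithm.

14.

Program Size

The size in bytes of the program.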

15. 

Speed  Up 

A  value  computed  by  dividing  the  response  time  for  an 

algorithm  executing  on  a  single  processor  by  the  response  time  for  an  algorithm 
executing  on  several  nodes  in  a  system. 
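The ratio-style metrics defined above (processor memory ratio, average message size per processor, processor utilization, and speed up) reduce to short calculations. The following is a minimal sketch in Python, not part of the original study; the function names and the sample numbers are hypothetical and only illustrate the definitions as worded in the list.

```python
def processor_memory_ratio_shared(total_memory, n_processors, cache_per_processor):
    """Metric 10, shared memory case: total memory / processors + local cache."""
    return total_memory / n_processors + cache_per_processor


def average_message_size_per_processor(total_message_bytes, n_processors):
    """Metric 11: message bytes sent during the run divided by processor count."""
    return total_message_bytes / n_processors


def processor_utilization(busy_times, total_time):
    """Metric 13: sum of per-processor busy times over (N * T), as a percentage."""
    n_processors = len(busy_times)
    return 100.0 * sum(busy_times) / (n_processors * total_time)


def speed_up(single_processor_time, multi_node_time):
    """Metric 15: single-processor response time over multi-node response time."""
    return single_processor_time / multi_node_time


# Hypothetical run: four processors, algorithm response time of 10 seconds.
busy = [8.0, 6.0, 7.0, 3.0]
print(processor_utilization(busy, total_time=10.0))
print(speed_up(40.0, 10.0))
```

Utilization and speed up are complementary here: a mapping can show a large speed up while still leaving most processors idle, which is why the list keeps them as separate metrics.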
Figure 4. Processors


Figure 5. Processor/Function

system.  The  information  presented  in  figures  4-7 
illustrates  the  types  of  data  documented  for  the  existing 
system.  These  figures  represent  a  sample  of  the  data  used 
for  the  baseline  characterization  of  the  Wide  Aperture 
Array  System  case  study. 

Although  the  existing  baseline  uses  a  distributed 
processing  architecture,  some  of  the  experiences  can  be 
carried  over  to  the  massively  parallel  processor 
architecture.  For  example,  since  trackers  do  not  require 
large  amounts  of  throughput,  the  MPP  implementation  for 
trackers  probably  would  not  change  significantly.  Figures 
4-7  present  four  different  views  of  the  Wide  Aperture 
Array  System.  Partitioning  functions  to  resources  has 
become  the  focal  point  for  the  case  study  because  of  its 
significance  in  massively  parallel  array  architectures. 

4.3: Object oriented design

The  object  oriented  software  design  follows  the 
functional  specification,  and  the  existing  system  baseline. 
The  inclusion  of  the  object  oriented  step  translates 
functional  requirements  to  objects  suitable  for  software 
design.  Object  oriented  design  should  facilitate  the  reuse 
and  portability  of  software. 

Once  the  functional  requirements  have  been  designed, 
a  software  engineer  determines  the  set  of  software  objects 
necessary  to  achieve  the  desired  functionality.  An  analysis 


Figure 6. Throughput


Figure 7. Throughput/Function

takes  place  with  the  assistance  of  an  object  oriented  design 
tool  like  OMTool.  Future  versions  of  products  like 
OMTool  will  perform  the  functional  to  object  oriented 
translation  automatically;  however,  OMTool  cannot 
perform  the  translation  at  this  time. 

OMTool  provides  functional,  object  and  data  flow 
views  for  a  given  application.  In  addition,  the  tool 
produces  C++  code.  This  paper  neither  endorses  nor 
denounces  the  use  of  OMTool.  Engineers  working  on  the 
WAA  case  study  use  OMTool  because  of  the  features 
available  for  the  given  price  range. 

Figure 8 illustrates the object oriented array system
design.  Figure  9  expands  the  object  oriented  beamformer 
design.  These  diagrams  represent  a  synopsis  of  the  object 
oriented  design  which  will  be  used  to  reengineer  the  WAA 
system.  Note  that  although  the  object  oriented  design 
started  with  the  WAA  application  in  mind,  the  high  level 
software  suits  any  array  processing  problem  using  a  2 
stage  time  delay  beamformer. 

In  the  future  a  systems  engineer  specifying  the 
functional  level  requirements  could  expedite  the  object 
oriented  design  if  a  link  was  developed  between  tools  like 
OMTool  and  RDD-100.  The  link  could  further  automate 
the  design  process.  In  addition,  the  link  would  also 
ensure  the  consistency  of  requirements  between  object 
oriented  and  system  design  tools. 
Figure 8. Object Oriented Array System Design


Figure  9.  Object  Oriented  Beamformer  Design 
4.4: Instantaneous Load Estimation and Static
Benchmarking

A preliminary sizing for the WAA case study
demonstrates the application of instantaneous load
estimation and static benchmarking. The number of
floating point multiply and addition operations was
calculated for the functions identified in Figure 1. The
sustained throughput estimates in Table II reflect these
multiply and addition estimates coupled with input data
rates.

Intel Corporation provided the efficiency and peak
numbers in Table II based on Paragon single processor
implementations written in Fortran. The peak numbers do
not reflect scaling effects due to I/O and partitioning. As a
result, efficiencies for a final massively parallel version
would probably be lower. Therefore, Table II presents the
results of a static benchmarking effort and represents a
preliminary sizing for the WAA processing problem.
Initial estimates indicate that the WAA processing
requires a massively parallel architecture capable of
providing 36 GFLOPS of peak throughput.

Table II. Static Benchmarking Load Estimation [8]

Operation                           Sustained   Efficiency   Peak
                                    (GFLOPS)                 (GFLOPS)
Beamforming                         7.50        32%          23.44
FIR Filters                         0.54        25%          2.16
Complex FFTs                        2.02        56%          3.61
Cross PSD                           0.23        14%          1.64
Auto PSD's                          0.16        14%          1.14
Integrate Auto Spectra              0.01        14%          0.07
Inverse Complex FFT                 1.01        56%          1.80
Normalized XCOR/Up Sample and       0.25        14%          1.79
Interpolate FIRs
Total GFLOPS                        11.72                    35.65

5: Detailed modeling and requirements
allocation

The detailed level modeling and requirements
allocation method provides a specific design for an MPP
architecture. The method presented in this paper addresses
four issues:

1. technology independence,

2. software partitioning,

3. full scale modeling, and

4. dynamic benchmarking.

Technology  independence  means  that  it  is  possible  to 
retarget  the  software.  Partitioning  involves  dividing  the 
processing  into  pieces  which  can  run  on  individual 
processing  elements. 

Massively  parallel  architectures  can  have  an  assorted 
collection  of  heterogeneous  analog  or  digital  processors. 
The  program  that  runs  the  real-time  embedded  system 
typically  can  have  hundreds  of  thousands  of  lines  of 
source  code.  The  system  is  generally  very  complex, 
difficult  to  design,  and  hard  to  maintain. 

Large  combat  systems  historically  use  a  number  of 
heterogeneous  processors  connected  in  a  distributed 
network  structure.  Continuing  this  trend  would  lead  to 
expansive  custom  MPP  architectures. 

Custom  MPP  architectures  have  thousands  of 
processors  connected  as  nodes  in  some  kind  of  network 
structure.  Commercial  processors  like  the  Intel  i860,  Sun 
Sparc,  or  DEC  Alpha  chip  perform  the  processing 
functions  in  the  nodes.  Hypercube,  mesh,  hierarchical  ring, 
or  tree  topologies  form  the  basis  of  the  networks. 
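The architecture diameter metric defined earlier depends directly on which of these topologies is chosen. As a minimal sketch (the helper names `hypercube`, `mesh`, and `diameter` are hypothetical, and the graphs are idealized with no failed links), the diameter can be found by a breadth-first search from every node:

```python
from collections import deque


def hypercube(dim):
    """Adjacency lists for a hypercube: neighbors differ in exactly one bit."""
    return {n: [n ^ (1 << b) for b in range(dim)] for n in range(1 << dim)}


def mesh(rows, cols):
    """Adjacency lists for a 2-D mesh without wraparound links."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            adj[(r, c)] = [(i, j) for i, j in candidates
                           if 0 <= i < rows and 0 <= j < cols]
    return adj


def diameter(adj):
    """Largest shortest-path hop count between any two nodes (connected graph)."""
    worst = 0
    for source in adj:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in hops:
                    hops[neighbor] = hops[node] + 1
                    queue.append(neighbor)
        worst = max(worst, max(hops.values()))
    return worst


print(diameter(hypercube(4)))  # 16 nodes
print(diameter(mesh(4, 4)))    # 16 nodes
```

For the same 16 nodes, the hypercube gives a smaller diameter than the mesh at the cost of more links per node, which is the kind of trade the diameter and latency metrics are meant to expose.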

Companies  like  Intel,  Thinking  Machines  and  Kendall 
Square  Research  have  developed  the  MPP  architectures 
into  commercially  available  systems.  These  systems  may 
have  homogeneous  or  heterogeneous  processing  elements. 
The  difference  between  commercial  and  custom  MPP 
architectures  lies  in  the  user  base.  System  engineers 
optimize  custom  architectures  for  one  specific  application 
for  a  limited  market.  Companies  build  commercial 
architectures  as  products  for  a  more  generic  user 
community.  The  commercial  architectures  may  not  fit  a 
particular  application  as  well  as  a  custom  architecture; 
however,  the  commercial  architecture  will  fit  a  broader 
range of applications. Table III characterizes custom and
commercial architectures.

Table III. Custom Versus Commercial MPP
Architectures

MPP Architecture   H/W Cost   S/W Cost      Portability
Custom             high       high          none
Commercial         medium     medium-high   partial

If MPP markets develop successfully, the hardware
cost of commercial MPP architectures will shrink faster
than that of custom MPP architectures. The software
development costs of the custom architectures are
prohibitively high. Life cycle costs for custom
architectures are also high because of lack of portability.
With appropriate research and development in the
engineering of complex systems, the software for
commercial MPP architectures can achieve lower cost
through partial portability. One objective of the
Massively Parallel System Design task is to address
detailed level MPP software mapping and portability.


Figure 10. Technology Independent Application

Despite  continuing  research  efforts  in  parallel 
processing,  two  challenges  exist  for  MPP  architectures: 

1.  The  MPP  scalability  problem  presents  a  major 
obstacle.  Efficiencies  from  benchmarks  with  large 
(thousands  of  processors)  MPP  architectures 
measure  less  than  10%.  For  vector  processors 
like  the  Cray  supercomputer,  the  efficiencies 
measure higher than 10%. These inefficiencies
create  a  high  incentive  to  increase  the  speed-up  of 
MPP  systems. 

2. The MPP programming problem necessitates a
significant  up  front  development  effort  for 
partitioning.  Software  engineers  cannot  program 
MPP  architectures  easily.  One  dominating  issue 
relates  to  the  mapping  process.  The  mapping 
process  determines  partitions  and  allocates 
functions  on  MPP  architectures.  The  absence  of 
automated  mapping  tools  requires  software 
engineers  to  manually  complete  the  mapping 
process. 


Scalability  and  partitioning  are  correlated.  Good 
software  partitioning  methods  generally  lead  to  good 
scaling.  Generally,  the  efficiency  and  scalability  increase 
with effective software development techniques. Note,
however, that this relationship is not linear and is
algorithm dependent.

5.1:  Technology  independence 

Technology  independence  presents  a  significant 
hurdle  to  real-time  embedded  MPP  architectures.  Figure 
10  shows  one  approach  for  attaining  technology 
independence. In general, the objective and procedures are
similar to those of previous work. The uniqueness lies in
the  details  of  the  methodology.  The  method  concentrates 
on  using  commercially  available  tools  whenever  possible. 
Many  of  these  tools  have  graphical  user  interfaces. 

Graphical  interfaces  facilitate  the  use  of  signal  flow 
graphs  for  representing  real-time  embedded  applications. 
Node  labels  represent  computation  loads  in  the  signal  flow 
graph.  Directed  edges  symbolize  data  dependency  in  the 
graph. Edge labels characterize the communication delay
of  signals  from  node  to  node.  Simple  FIFOs  between  the 
nodes  can  represent  communication  delays  for  some  target 
architectures. 
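A labeled signal flow graph of this kind maps naturally onto a small data structure. The sketch below is illustrative only: the node names, loads, delays, and the processing rate are hypothetical, and the end-to-end time calculation is an assumption about how such labels might be used, not a procedure taken from this paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Node labels: computation load in floating point operations (hypothetical).
node_load = {"fft": 2.0e6, "beamform": 6.0e6, "detect": 1.0e6}

# Directed edges (data dependencies) labeled with communication delay in seconds.
edge_delay = {("fft", "beamform"): 0.05, ("beamform", "detect"): 0.01}


def end_to_end_seconds(node_load, edge_delay, flops_per_sec):
    """Longest path through the graph, counting node compute time and edge delay."""
    predecessors = {node: set() for node in node_load}
    for source, target in edge_delay:
        predecessors[target].add(source)

    finish = {}
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] + edge_delay[(p, node)] for p in predecessors[node]),
                    default=0.0)
        finish[node] = start + node_load[node] / flops_per_sec
    return max(finish.values())


print(end_to_end_seconds(node_load, edge_delay, flops_per_sec=1.0e7))
```

Keeping loads on nodes and delays on edges means the same graph can be re-evaluated for a different target simply by changing the processing rate and the edge delays.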

This  kind  of  programming  method  uses  block 
diagrams,  large  grained  data  flow  graphs,  and  synchronous 
data flow graphs. The graphics facilitate the entry of a
digital  signal  processing  (DSP)  application.  The  task 
software may be written in any procedural language so that
the function can be simulated on a host processor
before it is mapped onto a target MPP system.

This method meets the scalability and portable
software challenges. Software engineers can use one of
three different techniques to program MPP architectures.
The first takes a regular sequential program and
compiles it for an MPP system. This technique is referred
to as the parallelizing compiler approach. The second
recodes the program in a parallel language such as
LINDA, FORTRAN 90, or a functional (applicative)
language. This technique is called parallel languages. The
first technique does not require a large effort to
rewrite software. However, a parallelizing compiler capable of
dealing with thousands of lines of code simply does not
exist, and the ones available for small programs suffer from
performance problems. The second approach requires a
new culture for programmers, and using a parallel
language still falls short of acceptable performance.

The  third  approach  follows  the  message  passing 
methodology  which  involves  explicit  parallel  environment 
control.  Hence,  the  third  technique  is  called  message 
passing.  The  programming  takes  place  in  an  environment 
like  PVM  or  EXPRESS  with  utilities  to  handle  parallel 
message  passing.  The  last  technique  requires  some  user 
awareness  of  the  topology  of  the  MPP  architecture,  but  it 
can  achieve  the  highest  scalability  and  efficiency.  Table 
IV  describes  the  software  techniques. 


Table IV. MPP Software Approaches

Approach                 Efficiency   Mapping
Parallelizing Compiler   Not Proven   Automatic
Parallel Language        ≤ 0.01%      Automatic
Message Passing          1% - 10%     Manual

Unfortunately,  automatic  mapping  technology  for 
partitioning  and  allocation  does  not  exist.  Good 
performance  in  programming  MPP  architectures  relies  on 
tedious  manual  mapping  methods. 

Single  Instruction  Multiple  Data  (SIMD)  MPP  and 
the  Multiple  Instruction  Multiple  Data  (MIMD)  MPP 
architectures  further  complicate  the  portability  challenge. 
SIMD MPPs encompass the Connection Machine and the
MasPar architectures. MIMD MPPs include the
iPSC/860, CM-5, DSP-3, and Paragon shown in Figure
10. This paper concentrates on MIMD architectures.

The  four  salient  features  for  the  portable  massively 
parallel  systems  design  (MPSD)  method  discussed  in  this 
paper  include: 


1.  task  level  function  module  parallelization  (coarse 
grain), 

2. high level procedural language and message
passing,

3.  portable  software  for  different  MIMD  MPPs,  and 

4. calibrated performance metrics for mapping.

After graphic entry and host function simulation, a
mapping procedure is required before the program can run
on an MPP system. Figure 10 shows a set of MPP
architectures.  MIMD  with  distributed  memory  and 
message  passing  are  our  target  systems.  The  Mapping 
Procedure  consists  of  partition  and  allocation. 

For  MIMD  with  distributed  memory  and  message 
passing, scheduling may be done at compile time.
Unlike real-time scheduling, compile time scheduling is
the most straightforward way to handle the real-time
requirement which dominates military applications.

5.2: Partitioning

The Calibrated Mapping Performance Prediction
(CMPP) paradigm leads to detailed modeling allocation.
The  CMPP  paradigm  is  discussed  in  this  section  of  the 
paper. 

Because  of  a  large  and  multidimensional  solution 
space,  heuristic  methods  provide  the  first  pass  solution. 
Therefore, automation and design aids would expedite a
broader search for a good solution for the total system
design.
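The paper does not name the heuristic used for the first pass. As one illustration of what such a heuristic can look like, the sketch below uses a standard longest-processing-time-first greedy allocation, which is a stand-in chosen for this example and not the authors' method. The task names and loads are hypothetical, and communication cost is ignored.

```python
import heapq


def lpt_allocate(task_loads, n_processors):
    """Greedy first pass: give the largest remaining task to the least loaded processor."""
    # Heap of (accumulated load, processor index); the lightest processor is on top.
    heap = [(0.0, p) for p in range(n_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_processors)}

    for name, load in sorted(task_loads.items(), key=lambda kv: kv[1], reverse=True):
        total, p = heapq.heappop(heap)
        assignment[p].append(name)
        heapq.heappush(heap, (total + load, p))

    loads = {p: sum(task_loads[t] for t in tasks) for p, tasks in assignment.items()}
    return assignment, loads


# Hypothetical computation loads for five function modules.
tasks = {"a": 7.0, "b": 5.0, "c": 4.0, "d": 3.0, "e": 1.0}
assignment, loads = lpt_allocate(tasks, n_processors=2)
print(assignment, loads)
```

A first pass like this only balances computation. The mapping cycle described in this section would still have to evaluate the result against the communication-related metrics and adjust the partition.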

Technology  independence  depends  also  on  the  MPP 
mapping  procedure.  Mapping  involves  partitioning  and 
allocating  function  modules  on  the  MPP  architecture.  The 
absence  of  parallelizing  compilers  and  languages  leaves 
only  design  aids  to  ease  the  mapping  process.  The  user 
has  to  couple  the  procedural  modules  with  message 
passing  operations.  The  process  is  slow  when  the  user  has 
to  do  a  manual  mapping  for  all  the  pieces  (thousands),  as 
well  as  run  the  MPP  execution  to  decide  whether  the 
mapping works. Figure 11 shows this mapping procedure
in  detail.  This  cycle  will  be  repeated  to  optimize  each 
performance  metric. 

This  paper  proposes  a  calibrated  mapping 
performance  prediction  paradigm.  Figure  12  illustrates  the 
paradigm.  The  key  idea  concentrates  on  performance 
model simulation. Rather than do a functional execution
on a full scale MPP to collect dynamic performance metric
data, the benchmark is collected from model simulation.
The  full  scale  model  then  provides  estimates  of  the 
architectural  performance  in  terms  of  the  previously 
defined  metrics. 
Figure 11. Detailed Mapping Procedure


Figure  12.  Calibrated  Mapping  Performance  Prediction  Paradigm 


5.3:  Full  scale  modeling 

The  performance  model  requires  two  kinds  of 
modules:  the  execution  module  (EXEC),  and  the 
communication module (COMM). The execution time
metric is the initial focus, since execution time is directly
related  to  the  speed  up  in  MPP  architectures.  EXEC  load, 
EXEC  bandwidth,  COMM  load,  and  COMM  bandwidth 
characterize  these  modules.  The  host  program  estimates 
the  EXEC  load  and  COMM  loads  for  all  the  partitioned 
pieces of a specific mapping. The collected data become
model  parameters  to  annotate  the  performance  model  before 
simulation.  Each  new  partition  requires  repetition  of  the 
load  estimation  and  extraction  process.  Any  automation 
that  can  be  added  would  be  desirable. 

EXEC  modules  and  COMM  modules  are  used  to 
build  the  performance  model  with  token  networks.  The 
token  network  handles  multiple  transmitters  like  real 
network  situations.  Presently  the  model  can  only  handle 
Ethernet  simulation.  Construction  of  the  performance 
model  is  done  in  the  graphics  mode.  The  VHDL  feature 
simplifies  the  replication  of  thousands  of  identical 
modules.  The  Calibrated  Mapping  Performance  Prediction 
(CMPP)  paradigm  hides  many  of  the  details  of  message 
passing  so  that  the  designer  can  concentrate  on  the 
partition  and  allocation  problems.  The  right  environment 
enables replication of the modules many times. This
environment  reduces  the  problem  of  scaling  to  thousands 
of  processors. 

The  CMPP  paradigm  discussed  in  this  paper  used  the 
VHDL  environment.  Note  that  VHDL  is  not  used  here  for 
hardware  design;  instead  VHDL  allows  the  designer  to 
construct  the  structure,  simulate  the  performance,  and 
collect  metric  data.  Both  PC's  and  workstations  support 
VHDL  environments  at  low  cost.  VHDL  will  be  available 
for  hardware  and  system  design  for  a  long  time.  In 
addition,  constructs  of  the  VHDL  language  can  replicate 
modules as shown in Figure 12 in a straightforward
manner.  VHDL  generic  constructs  also  help  annotate 
model  parameters  before  simulation.  The  manual 
EXEC/COMM  load  estimation  and  extraction  is  a 
disadvantage  of  the  CMPP  paradigm.  An  automatic 
procedure  would  strengthen  the  CMPP  paradigm. 

The  CMPP  paradigm  allows  the  partition  and 
allocation  results  to  be  portable  to  different  types  of 
MPPs.  Remapping  is  necessary  due  to  different  network 
bandwidths,  topologies  and  throughput  rates  in  different 
MPPs. However, the CMPP paradigm minimizes the portability
effort  as  much  as  possible.  The  CMPP  paradigm  reduces 
the  effort  needed  to  run  a  real-time  embedded  application 
on  different  MPP  architectures. 


The  FLOPS-I/O  ratio  characterizes  the  proportion  of 
computation  done  versus  communication  (I/O)  required  in 
the  partition.  The  ratio  can  characterize  the  architecture 
element  once  maximum  throughput  requirements  are 
fulfilled.  If  the  peak  FLOPS-I/O  ratio  of  an  architecture 
element  is  less  than  that  of  the  application  module,  it  is 
possible  to  fit  the  application  module  into  the  element.  If 
the peak FLOPS-I/O ratio of an element is greater than that of the
application module, the partition will encounter problems.
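The fit rule in this paragraph can be written as a single comparison. The following sketch assumes the ratio is computed as MFLOPS divided by bytes per second, as in the metric definition; the function names and the sample figures are hypothetical. The text does not say what happens when the two ratios are equal, so the sketch treats only a strictly smaller element ratio as a fit.

```python
def flops_io_ratio(mflops, io_bytes_per_sec):
    """Computation load (MFLOPS) relative to I/O load (bytes/sec)."""
    return mflops / io_bytes_per_sec


def module_fits(element_peak_ratio, module_ratio):
    """True when the element's peak ratio is less than the module's ratio."""
    return element_peak_ratio < module_ratio


# Hypothetical module: 50 MFLOPS of work needing 2 MB/s of I/O.
module = flops_io_ratio(50.0, 2.0e6)

# Hypothetical element A: 100 MFLOPS peak with a 10 MB/s link (I/O rich).
element_a = flops_io_ratio(100.0, 10.0e6)

# Hypothetical element B: 100 MFLOPS peak with a 1 MB/s link (I/O poor).
element_b = flops_io_ratio(100.0, 1.0e6)

print(module_fits(element_a, module))
print(module_fits(element_b, module))
```

Read this way, a lower element ratio means relatively more I/O capacity per unit of computation, so the element can keep the module supplied with data.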

Essentially,  the  FLOPS-I/O  ratio  characterizes 
computational  activities  relative  to  communication 
activities.  With  this  metric,  it  will  be  easier  to  analyze  the 
results  of  different  mapping  processes  by  examining 
granularity.  One  definition  of  fine  grain  tasks  refers  to 
small  FLOPS-I/O  ratios.  Fine  grain  application  modules 
can  only  be  efficiently  accommodated  in  fine  grain 
architecture  elements. 

The  FLOPS-I/O  ratio  metric  makes  it  possible  to  find 
a  common  partition  of  an  application  for  a  set  of  MPPs. 
The common partition usually cannot achieve the best
speedup  and  efficiency  in  a  specific  MPP,  but  the  partition 
can  be  accommodated  in  a  number  of  MPPs.  Further 
development  of  the  CMPP  paradigm  will  demonstrate  this 
situation  in  the  future. 

A  collection  of  Sparc  workstations  on  an  Ethernet 
was  used  to  demonstrate  the  CMPP  approach  during  1993, 
since  the  researchers  did  not  have  access  to  a  commercial 
MPP  architecture.  The  researchers  also  used  a  message 
passing  development  environment  called  EXPRESS. 
EXPRESS  addresses  the  portability  challenge  for  the 
CMPP  paradigm. 

The  development  consists  of  three  parts.  First,  the 
EXEC  module  characterizes  a  piece  of  the  execution  that 
occurred  in  the  architecture  element  of  the  MPP.  EXEC 
modules  represent  a  source  that  generates  a  load  token,  a 
feed-through  that  accepts  input  tokens  and  produces 
output  tokens,  or  a  sink  that  consumes  a  load  token.  The 
following  VHDL  parameters  characterize  EXEC  modules: 

INST => unique module name
Unit => 1 Kbytes
Size_info => statistic size in units
Throughput_info => (#/sec) statistic throughput rate
Latency_info => statistic delay (usec)
* Duty_cycle_info => (#/sec) statistic duty cycles

* Only relevant for source EXEC modules.

Figure 13 shows a VHDL structure for the modules.
The  left  most  block  depicts  a  source  EXEC  module,  and 
the  right  most  one  a  sink  EXEC  module.  The  INST 
generic  describes  a  unique  name  for  the  module  in  the 
model. The product size * unit characterizes the load of execution.
The term "unit" represents the basic data size such as in
bytes  or  Kbytes.  The  throughput  characterizes  the  speed 
of  this  EXEC  module.  The  latency  feature  permits  a  more 
accurate  delay  account.  Duty  cycle  is  relevant  if  the 
EXEC  module  is  a  source  that  generates  periodic  loads. 


Figure  13.  A  Structure  of  EXEC  Modules,  COMM 
Modules,  and  Ethernet 

Figure  13  shows  two  COMM  modules  called  ebiu. 
The COMM module can receive from or transmit to a
local port. The data transfer on the global port is also
bidirectional. The following VHDL generics characterize the
COMM  module: 

INST => unique module name
bw_unit_per_sec => unit size (byte)
bandwidth_info => statistic bandwidth (byte/sec)
Tx_latency_info => statistic transmit latency delay (usec)
Rx_latency_info => statistic receive latency delay (usec)
Bus_timeout_info => statistic time-out (usec)
Ack_time_info => statistic acknowledgment time (usec)

Bandwidth  and  bw_unit_per_sec  characterize  the 
channel  limitation  of  the  bus.  For  the  case  of  Ethernet, 
part  of  the  Ethernet  features  reside  in  the  COMM  module, 
and  the  other  features  like  arbitration  reside  in  the  token 
signal  resolution  function. 

The  VHDL  resolution  function  is  a  special  facility 
available  in  the  VHDL  language  that  handles  multiple 
signal  drivers.  The  signals  in  this  model  are  all  data  token 
types.  A  special  VHDL  resolution  function  is 
implemented  to  model  the  Ethernet. 


Figure  14.  Ebiu  Sub  Module  Structure 


The  ebiu  is  in  turn  built  from  two  sub  modules:  Local 
and  Globalnet.  The  sub  module  structure  is  shown  in 
Figure  14.  The  VHDL  environment  can  build  these 
entities,  module  structures,  and  sub  module  structures 
hierarchically.  Graphics  windows  permit  editing, 
checking, and simulation. The bottom level behavior of
the EXEC or COMM modules is written in VHDL.

5.4:  Dynamic  benchmarking 

Dynamic  benchmarking  helps  to  validate  the  full  scale 
models.  The  data  from  the  dynamic  benchmarks  help 
refine  the  simulation  models  to  reduce  the  risk  associated 
with  the  scaling  process.  Due  to  the  lack  of  availability  of 
a  commercial  MPP  architecture  during  1993,  the  Naval 
Postgraduate  School  researchers  explored  the  CMPP 
paradigm  using  Sun  Workstations  connected  by  Ethernet 

One  important  feature  in  the  CMPP  paradigm 
involves  the  calibration  process  for  EXEC/COMM  model 
parameters.  The  calibration  process  requires  dynamic 
benchmarking  for  fine  tuning.  The  EXEC/COMM 
parameters  are  extracted  from  a  functional  program.  The 
results  enable  calibration  of  the  model.  The  calibration 
process  refers  to  the  adjustment  of  parameters  by 
comparing  a  benchmark  from  the  CMPP  prediction  to  the 
parameters  from  the  actual  execution.  The  calibration 
process  ensures  the  validity  of  the  model. 
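The paper does not spell out how the parameters are adjusted during calibration. One simple way to picture it, offered here only as an assumption, is a least-squares fit of a straight-line delay model (fixed latency plus message size over bandwidth) to measured message delays. The function name is hypothetical and the "measurements" below are synthetic values generated for the example, not data from the study.

```python
def fit_linear_delay(sizes_bytes, delays_sec):
    """Least-squares fit of delay = fixed_latency + size / bandwidth."""
    n = len(sizes_bytes)
    mean_size = sum(sizes_bytes) / n
    mean_delay = sum(delays_sec) / n
    sxx = sum((s - mean_size) ** 2 for s in sizes_bytes)
    sxy = sum((s - mean_size) * (d - mean_delay)
              for s, d in zip(sizes_bytes, delays_sec))
    slope = sxy / sxx                      # seconds per byte
    fixed_latency = mean_delay - slope * mean_size
    bandwidth = 1.0 / slope                # bytes per second
    return bandwidth, fixed_latency


# Synthetic "measured" delays from a known model, used only to exercise the fit.
sizes = [1024, 2048, 4096, 8192]
measured = [0.1 + s / 50_000.0 for s in sizes]

bandwidth, fixed_latency = fit_linear_delay(sizes, measured)
print(bandwidth, fixed_latency)
```

With real measurements the fit would not be exact, and the residuals between predicted and measured delays would indicate whether a straight-line model is adequate for the network being calibrated.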


Figure 15. Ethernet Delay Versus Message Size

The  crucial  step  for  the  experiments  developed  during 
1993  was  to  model  and  characterize  the  Ethernet  correctly. 
The  aforementioned  calibration  process  tuned  the  COMM 
modules  (ebiu).  Figure  15  presents  the  actual  message 
delays  and  the  model  predictions.  The  message  size  varied 
from  1  Kbyte  to  32  Kbytes.  The  predicted  and  measured 
data  matched  very  well.  The  model  parameters  that  yielded 
this  prediction  consist  of: 
bw_unit_per_sec => byte
bandwidth_info => 48,000
Tx_latency_info => 41.280 ms
Rx_latency_info => 10.000 ms
Ack_time_info => 41.280 ms
bus_timeout_info => 10 sec
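To give a feel for the magnitudes these parameters imply, the sketch below combines them into a single message delay. The additive form (transmit latency, plus transfer time at the stated bandwidth, plus receive latency, plus acknowledgment time) is an assumption made for this illustration; the actual VHDL model, including bus arbitration and time-outs, is not reproduced here, and 1 Kbyte is taken as 1024 bytes.

```python
from dataclasses import dataclass


@dataclass
class CommParams:
    """Calibrated COMM (ebiu) parameters from the list above, converted to seconds."""
    bandwidth_bytes_per_sec: float
    tx_latency_sec: float
    rx_latency_sec: float
    ack_time_sec: float


def message_delay(params, message_bytes):
    """Assumed additive delay model; not taken from the paper's VHDL source."""
    transfer = message_bytes / params.bandwidth_bytes_per_sec
    return (params.tx_latency_sec + transfer
            + params.rx_latency_sec + params.ack_time_sec)


ebiu = CommParams(bandwidth_bytes_per_sec=48_000.0,
                  tx_latency_sec=0.04128,
                  rx_latency_sec=0.010,
                  ack_time_sec=0.04128)

for kbytes in (1, 8, 32):
    print(kbytes, "Kbytes:", round(message_delay(ebiu, kbytes * 1024), 4), "sec")
```

Under this assumed form the fixed latencies dominate small messages, while transfer time dominates as the message size approaches 32 Kbytes.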

In  addition  to  the  Ethernet  modeling,  two 
beamformers were coded and tested. One beamformer used
a  frequency  domain  algorithm,  and  the  other  a  time  domain 
algorithm.  The  time  domain  algorithm  reflects  the  type 
used in the Wide Aperture Array System.

The  frequency  domain  beamformer  demonstrated  the 
advantage  of  using  MPP  systems.  The  hypothetical 
beamformer  assumed  96  sensors  in  the  system.  Beam 
response  covered  0  to  180  degrees  with  1  degree 
resolution.  A  host  program  in  FORTRAN  77  was  written 
and  checked  with  the  test  data  to  assure  correctness.  The 
mapping  procedures  outlined  in  this  paper  were  used  to 
partition  and  execute  the  application  under  the  parallel 
EXPRESS environment. The metric plotted in Figure 16
is the execution time. This mapping procedure was
repeated for 1, 2, 4, 6, and 8 architecture elements on the
network of workstations. The results show that increasing
the number of processing elements decreases the execution time.


Figure  16.  Execution  Time  Versus  Number  of 
Processors 

Figure  17  exhibits  the  computational  and 
communication  loads  for  the  frequency  domain 
beamformer.  The  loads  were  estimated  and  extracted  as 
described  in  the  CMPP  paradigm  in  Figure  12.  These 
estimates  represent  the  loads  for  each  processor. 

The  two  main  execution  modules  are:  the  FFT  module 
and  the  Vector-Matrix  product  module.  The  other  modules 


are  executed  in  the  host.  The  diagram  shows  that  the 
largest  execution  load  occurred  in  the  Vector-Matrix 
module.  The  heaviest  traffic  on  the  Ethernet  was  the 
message  shuffle  between  the  FFT  and  Vector-Matrix 
modules. The information in Figure 17 was accumulated
using the Sun operating system tcov command. COMM loads
were  estimated  using  EXPRESS  profilers. 

The  parallel  EXPRESS  environment  can  also  provide 
an  event  profile  which  shows  the  communication 
activities,  and  the  execution  activities  of  the  processors. 

After  the  analysis,  the  next  step  is  to  construct  a 
partition  structure  in  the  VHDL  environment  that 
simulates  the  performance.  A  structure  for  the  8-node 
partition  was  developed.  The  objective  is  to  be  able  to 
predict  performance  such  as  execution  time  shown  in 
figure  16.  Progress  is  ongoing  and  encouraging. 


[Figure 17 values: Execution Load (FLOPS): 1,828,052 and 5,544,362; Communication (Bytes): 43,008 and 23,040]

Figure 17. EXEC and COMM Loads for an 8-Processor Partition

A  single  panel  of  the  WAA  beamformer  was  also 
programmed  during  1993.  The  WAA  program  includes 
test  data  generation,  time  delay  memory,  1:3  interpolation, 
full  beam  vertical  shading,  and  beam  summation.  The 
program has been tested and verified.

EXPRESS  will  be  used  to  map  the  application  to  the 
Sun  Workstation  environment.  Table  V  reveals 
preliminary  execution  time  data  for  the  WAA  beamformer 
program  on  three  high  speed  computers:  the  Sparc  630MP 
(2 processors), the Navy TAC-3 (HP 9000/730), and the
Cray YMP/EL. The Cray yielded the best execution time,
but the TAC-3 yielded the smallest execution code size.
The TAC-3 is about 10 times slower than the Cray
YMP/EL, but requires about 25 times less code.


Table V. Time Domain Beamformer Benchmark

Architecture                   Execution Time (sec)   Execution Code (Bytes)
Sparc 630MP (2 processors)     833.8                  237,568
TAC-3 (HP 9000/730)            339.6                  32,768
Cray YMP/EL (4 processors)     35.4                   802,320
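The "about 10 times" and "about 25 times" comparisons follow directly from the Table V entries. The short check below uses only the values in the table; the dictionary layout and variable names are illustrative choices.

```python
# (execution time in seconds, execution code size in bytes) from Table V
benchmarks = {
    "Sparc 630MP": (833.8, 237_568),
    "TAC-3": (339.6, 32_768),
    "Cray YMP/EL": (35.4, 802_320),
}

tac3_time, tac3_code = benchmarks["TAC-3"]
cray_time, cray_code = benchmarks["Cray YMP/EL"]

time_ratio = tac3_time / cray_time   # how many times slower the TAC-3 ran
code_ratio = cray_code / tac3_code   # how many times larger the Cray code is

print(f"TAC-3 / Cray execution time: {time_ratio:.1f}")
print(f"Cray / TAC-3 code size:      {code_ratio:.1f}")
```

The ratios come out near 9.6 and 24.5, consistent with the rounded figures quoted in the text.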

6:  Summary 

This  paper  proposed  a  generic  approach  to  mitigate 
the risk when investing in a specific MPP architecture.
The approach proposes steps to assess the architecture and
requirements  compatibility  which  include:  defining  the 
system  requirements,  sizing  architecture  using  static 
benchmarks,  allocating  resources  using  systems 
engineering  tools,  developing  a  full  scale  model, 
validating  the  full  scale  model  with  dynamic  benchmarks, 
assessing  the  compatibility  of  the  architecture  with  the 
real-time  embedded  application,  and  selecting  the 
appropriate design approach based on a trade-off analysis.
The  approach  embodies  reengineering  considerations  for 
cost  or  technology  upgrades. 

The  Wide  Aperture  Array  case  study  demonstrated  the 
approach's  usefulness.  This  paper  presented  a  functional 
specification,  existing  system  baseline,  object  oriented 
design  and  a  series  of  benchmarks  for  the  WAA 
application.

In  addition  to  the  WAA  case  study,  the  paper  also 
included  detailed  modeling  and  simulation  data  for  a 
frequency  domain  beamformer  implemented  on  several  Sun 
Workstations  networked  with  Ethernet.  This  test 
demonstrated  the  utility  of  the  Calibrated  Mapping 
Performance  Prediction  Paradigm.  The  information 
presented  included  an  explanation  of  the  detailed  VHDL 
models  that  a  researcher  fine  tuned  to  reflect  the  actual 
operation  of  the  network. 

Several issues surfaced during the course of the research
described in this paper. These issues offer possibilities for
extended research in the future. The issues include:

1.  compatibility between engineering tools (e.g.,
RD-100, OMTool, etc.),

2.  portability of object oriented design software,

3.  technology independent software for massively
parallel architectures,


4.  scalability  of  massively  parallel  architectures, 

5.  availability  of  commercial  massively  parallel 
architectures,  and 

6.  suitability  of  the  massively  parallel  systems 
design  approach  to  other  case  studies. 

Research  will  continue  to  pursue  the  massively 
parallel  system  design  framework  discussed.  Plans 
encompass implementation, modeling, benchmarking and
simulation  of  the  Wide  Aperture  Array  functions.  A 
continued  focus  will  be  placed  on  using  tools  like  VHDL, 
RD-100  and  OMTool,  in  addition  to  integrating  tools 
emerging  from  the  Engineering  of  Complex  Systems 
Block  Program. 
Abstract

This paper describes Xinotech Research’s knowledge-based,
metalanguage-based programming environment to support automatic
program transformation and object abstraction for forward and
reverse engineering. In this environment, both knowledge
extraction and knowledge abstraction are metalanguage-based and
thus language independent. The transformation engine is
accessible through the interactive syntax-directed tools for
program construction or for massive reengineering. This
transformation infrastructure is operational for Ada, and can be
applied to transform existing programs to support object-oriented
methodologies, to port existing software to new libraries and
platforms, to translate automatically between languages, to
change the meaning of programs, or to enforce the semantics of
applications or programming standards. It also supports
specification and prototyping languages, and can be retargeted to
other programming languages.

I.  Introduction

“By now it is hard to imagine that any computer professional has
not become aware of the bottleneck in software development. For
both commercial and government applications, the annual bill for
software is rising at a rapid pace. For example, the U.S.
Department of Defense (DoD) spent over $3 billion on software in
1980 and their expenses are projected to grow to $30 billion per
year by 1990 (DoD Annual report FY ’81). Moreover, these costs
are only the tip of the iceberg, as the impact of faulty
software, delayed software, and continuing maintenance costs
drive the real costs even higher” [12].

Ten years later, K.A. Banniuck confirms the above prediction. His
study estimates that software expenditures in 1990 were over $185
billion worldwide with approximately $90 billion being spent in
the U.S., and of that, $27 billion spent by the DoD [2].


“We might well ask, why this phenomenal growth in the cost of
software? There are several major reasons. One is the fact that
the requirements for new software systems are more complex than
ever before. A second reason for the rising cost of software is
the increased demand for qualified software professionals. A
third reason, is the fact that our software development tools and
methodologies have not continued to dramatically improve our
ability to develop software.

“...It has for a long time been recognized that one fundamental
weakness of software creation is the fact that an entirely new
software system is usually constructed ‘from scratch’. This is
clearly an unfortunate situation, as studies have shown that much
of the code of one system is virtually identical to previously
written code. For example, a study done at the Missile Systems
Division of Raytheon Company observed that 40-60 percent of
actual program code was repeated in more than one application
[11]. Therefore the idea of reusability would seem to hold one
answer to increasing software productivity. And yet the simple
notion of reusability (i.e., code reusability) has been
considered by computer professionals over the years but has never
been entirely successful.” [12]

Methodologies for reusability must be seamlessly integrated into
the design, coding, testing and maintenance phases of the
software cycle, which, according to E. Horowitz [12], account for
87% of the total life-cycle effort.

Any proposed methodology may fail to be adopted if it is not
supported by tools that synthesize software understanding and
automate the transformation of programs, so that reusability and
modernization techniques can be applied automatically and on a
large scale. Applying evolutionary transformations manually to
large software systems would prove overwhelming and would quickly
be abandoned. The difficulty is that such tools must themselves
be supported by an environment infrastructure with the following
properties:

1.  Structural and semantic knowledge of the programming
language (e.g. Ada).

2.  Reusable language knowledge that:

i.  Supports quick fabrication of a multitude of language
tools outside the realm of language translation, and

ii.  Allows customization of the created tools.

3.  Formal Specifications. The production of language tools
requires that the realization of structural and semantic language
rules be available to the tool designer so that they can be
applied to implement particular transformations, measurements and
analyses, infer or complement other rules, etc. For example, a
successful environment must provide the ability to specify new
applications such as transformations to modularize source code,
thus complementing the typical, hard-coded, predefined functions
of object code generation. Language knowledge reusability is best
supported by a system where languages can be defined with formal
specifications, independent and separate from the application
tools.

4.  If forward and reverse engineering tasks are to be unified,
it is essential that certain tools be interactive. This means
that the formal mechanisms to manipulate language structures must
be incremental. Traditionally, incrementality has been supported
through domain-dependent, algorithmic approaches. For the sake of
generality, it is desirable that incrementality be derived from
the semantics of the metalanguage.

From these requirements, it is clear why language-based tools to
automate the otherwise impossible task of manual transformation
of source code have not proliferated and matured.


II.  The Xinotech Environment

The environment is designed to be language-independent. Knowledge
extraction (e.g. parsing, creating abstract syntax trees, and
deriving semantic attributes) is expressed using a formal
notation called XML, the Xinotech Meta-Language. Knowledge
abstraction (the process of recognizing program patterns and
transforming them into higher-level structures) is expressed
through an XML component called XPAL, the Xinotech Pattern
Abstraction Language. The system can thus be retargeted for other
languages and applications at a fraction of the original cost.

XPAL is designed to express complex program patterns and to
specify transformations of these patterns into more cohesive
higher-level concepts. The alternative approach of using an
intermediate “universal” language to which programs are first
transformed causes the unnecessary loss of the original model and
still does not provide the means to tailor transformations. Only
with a pattern language does the task of specifying a vast
evolving library of patterns and their transformations become
feasible, allowing pattern specification to become an
application-oriented task.

XPAL  makes  use  of  a  complete  semantic  notation  and 
a  comprehensive  semantic  library.  Because  XPAL  is  a 
component  of  XML,  the  extraction  meta-language,  XPAL 
has  access  to  XSSL  (XML’s  semantic  notation)  as  well  as 
all  of  the  semantic  equations  written  to  properly  define  a 
particular  programming  language.  For  example,  writing 
patterns  that  require  the  use  of  language  scoping  rules  can 
be  done  by  simply  referring  to  the  corresponding  semantic 
equations. 

XPAL transformations can also be used as the vehicle to formalize
and document the implicit relations needed to abstract object
oriented (OO) models from non-OO programs.

The environment is designed to support interactive software
development, including syntax-directed construction, graphical
abstraction, and standards and guidelines detection and
enforcement. All these tools are built on top of the metalanguage
engine. Pattern transformations are available interactively
through these tools’ user interfaces. Transformation libraries
have been developed to support object orientation, conversion to
Ada 9X from Ada 83, and translation to Ada from CMS-2 and Jovial.

III.  XML, the Xinotech Meta-Language

The language-based, language-independent infrastructure of the
Xinotech environment is provided by the implementation of XML.
XML is a highly-readable language for specifying the abstract
grammar, external syntax (views) and semantics of languages. XML
is an environment metalanguage, because it supports the design,
implementation, embedding, revision and evolution of the various
languages used in a software development environment, such as
specification, documentation, design, programming, testing, and
configuration languages. XML provides support for quick language
prototyping, reusable language descriptions through module
decomposition and inheritance, inter- and intra-language
transformations, and separation of embedded and annotation
languages. It provides an open architecture for integration with
other traditional semantic analysis tools such as STARS ASIS for
Ada.

XML supports modules for the hierarchical decomposition of
languages. Modules are collections of related symbols.
Modularization allows the language designer to logically divide
the specification to enhance its readability. A language
specification can import modules from other XML specifications.
This encourages reusability when prototyping new languages.

A  construct  is  defined  in  terms  of  its  intrinsic  language 
properties,  such  as  abstract  grammar,  views  (unparsing 
syntax)  and  semantics.  Other  clauses  describe  details  for 
the  environment,  such  as  menus,  placeholders,  etc. 

XSSL,  the  Xinotech  Semantic  Specification  Language, 
is  a  component  of  XML.  XSSL  is  a  general  notation;  it 
supports,  e.g.,  the  expression  of  Ada  scoping  rules,  type 
checking,  data  flow  relations,  and  language  translation. 
XSSL  supports  structured  types  and  generalized  lists,  and 
it  incorporates  efficient  abbreviation  schemes  to  reduce  the 
complexity  of  expressions  due  to  explicit  semantic  flow.  It 
uses  object-oriented  encapsulation  to  achieve  the  reuse  of 
semantic  structures  throughout  multiple  constructs.  XSSL 
supports  incremental  evaluation  as  well  as  the  semantics  of 
inter-compilation-unit  relations. 


IV.  XPAL  and  Pattern  Abstraction 

Pattern abstraction is the transformation process of
automatically condensing or abstracting low-level source code
patterns found in existing software into high-level program
concepts. XPAL, the Xinotech Pattern Abstraction Language, a
declarative, constraint language, is the vehicle to express these
program patterns and their transformations. Since XPAL is a
component of XML, the Xinotech Meta-Language, these
transformations can be written for any language specified with
XML. Therefore, the entire mechanism is language independent.

Pattern abstraction is valuable because it recognizes implied or
concealed relationships in low-level source code and, by
representing them with existing higher-level structures, makes
the relationships explicit and conceptual, and the code more
cohesive and less fragmented. This reduces the complexity of the
representation while increasing the expressive power of the
resulting programs, thus enhancing their maintainability,
understandability and reusability. This process is the inverse of
top-down synthesis, such as program compilation.
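
The idea of condensing a low-level pattern into a higher-level construct can be sketched in a few lines of Python. This is not XPAL, whose notation is not shown in this paper; the tuple encoding of syntax trees, the `?name` convention for pattern variables, and the `increment` node are all assumptions made for this illustration.

```python
def match(pattern, tree, bindings):
    """Match tree against pattern; strings starting with '?' are variables."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            # A repeated variable must match the same subtree each time.
            return bindings[pattern] == tree
        bindings[pattern] = tree
        return True
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        return len(pattern) == len(tree) and all(
            match(p, t, bindings) for p, t in zip(pattern, tree)
        )
    return pattern == tree


def substitute(template, bindings):
    """Build the higher-level tree by filling variables into the template."""
    if isinstance(template, str) and template.startswith("?"):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def rewrite(tree, pattern, template):
    """Apply one pattern/template rule bottom-up over the whole tree."""
    if isinstance(tree, tuple):
        tree = tuple(rewrite(child, pattern, template) for child in tree)
    bindings = {}
    if match(pattern, tree, bindings):
        return substitute(template, bindings)
    return tree


# Low-level pattern "v := v + 1" and its hypothetical higher-level form.
LOW_LEVEL = ("assign", "?v", ("add", "?v", ("const", 1)))
HIGH_LEVEL = ("increment", "?v")

program = (
    "seq",
    ("assign", "i", ("add", "i", ("const", 1))),  # matches the pattern
    ("assign", "j", ("add", "k", ("const", 1))),  # different variables: no match
)
result = rewrite(program, LOW_LEVEL, HIGH_LEVEL)
print(result)
```

Only the first assignment is abstracted, because the repeated variable `?v` requires both occurrences to refer to the same name. A real pattern language would also consult semantic information such as scoping, which this sketch omits.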

In XPAL, patterns can be specified in terms of other patterns.
Because XSSL, a component of XML, is a general notation for
expressing the semantics of languages, patterns can use or
complement these semantic equations. The approach traditionally
taken in designing reengineering environments is that of
providing some semantic capabilities through a limited set of
hard-coded functions. In the XML family, graph operations, such
as transitive closures for data and control flow, can be
specified on the relationships characterized by XSSL equations. A
language this comprehensive makes pattern abstraction very
powerful.
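
As an illustration of the kind of graph operation mentioned above, the following Python sketch computes a transitive closure over direct data-flow relations. It shows the general technique only, not how XSSL specifies it; the edge list and the function name `transitive_closure` are hypothetical.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Map each source node to every node reachable through one or more edges."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)

    closure = {}
    for start in list(graph):
        seen = set()
        stack = list(graph[start])
        # Depth-first walk over direct successors.
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        closure[start] = seen
    return closure


# Hypothetical direct data-flow facts: the value of "a" flows into "b", etc.
flows = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "c")]
closure = transitive_closure(flows)

for variable in sorted(closure):
    print(variable, "reaches", sorted(closure[variable]))
```

A query such as "which variables can be affected by a change to `a`" then becomes a simple lookup in `closure`.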

Advantages of having XPAL as a component of XML. Because the XPAL
notation is embedded within XML, it has the advantages of full
access to the abstract grammar and semantics of the programming
language, access to a general semantic notation, the use of XML
extraction mechanisms, such as parsing views, to express tree
patterns textually, and the use of multiple views which allow
syntactical transformations to be expressed in the syntax of the
programming language.

V.  The  Xinotech  Program  Composer 

The Composer is the central application tool built on top of the
XML language infrastructure. It is a syntax-driven, interactive
semantic tool for the design and construction of programs.
Programs are managed as abstract syntax trees (AST), with
multiple textual representations or views. An incremental parser
and an incremental unparser provide the mappings between the
textual and the AST representations. One of the main areas of
concern during the design of the Composer was the functionality
and behavior of its incremental bottom-up parser. This parser was
designed to support a smooth left-to-right insertion while
providing full interactive language support such as automatic
template generation, placeholders, menus, and formatting while
typing. The user can select levels of template generation during
insertion. Templates are non-intrusive, since the user can type
over to skip optional clauses. Text files not created with the
Composer are automatically imported the first time they are
opened.

Views can be used to create multiple formatting schemes, or to
combine or isolate programs with embedded documentation and/or
PDL structures. The Composer supports browsing through libraries,
and provides program outlines from any point in the program.
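
The relationship between one abstract syntax tree and several views can be sketched as follows. This Python fragment is an assumption-laden illustration, not the Composer's unparser: the tuple-based tree, the template dictionaries `ADA_VIEW` and `PDL_VIEW`, and the PDL wording are invented for the example.

```python
def unparse(node, view):
    """Render an AST node as text using the templates of one view."""
    if isinstance(node, str):
        # Leaves (identifiers, literals) are printed as they are.
        return node
    kind, *children = node
    rendered = (unparse(child, view) for child in children)
    return view[kind].format(*rendered)


# Two views of the same constructs: Ada-like syntax and a PDL-like outline.
ADA_VIEW = {
    "assign": "{0} := {1};",
    "if": "if {0} then {1} end if;",
    "gt": "{0} > {1}",
}
PDL_VIEW = {
    "assign": "set {0} to {1}",
    "if": "when {0}: {1}",
    "gt": "{0} exceeds {1}",
}

tree = ("if", ("gt", "X", "0"), ("assign", "Y", "X"))

print(unparse(tree, ADA_VIEW))
print(unparse(tree, PDL_VIEW))
```

Because both renderings are derived from the same tree, an edit made through one view is reflected in the other without any text-level synchronization.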

VI.  The  Graph  Abstractor 

The Graph Abstractor is an analysis and maintenance tool designed
to display XSSL-generated semantic relations. These relations can
be displayed graphically or structurally. The Graph Abstractor is
designed to minimize the size and complexity of graphs and to
isolate the relations of current relevance.


VII.  The Guideliner

The Xinotech Guideliner is an interactive program analyzer. It
verifies adherence to programming guidelines, standards and
metrics, and transforms programs automatically to comply with
these guidelines. These guidelines are written using XPAL. The
design goals of the Guideliner were as follows:

1.  To  provide  an  integrated,  incremental  capability  to 
prevent  and/or  detect  and  flag  user-defined  guideline 
deviations  during  interactive  program  construction 
with  the  Composer. 

2.  To provide batch processing to obtain detailed and
statistical reports regarding non-compliance with user-defined
guidelines and standards. This can be useful during the quality
assurance phase of code acceptance from contractors.

3.  To provide the automatic translation of source code to
comply with user-defined guidelines and standards. This process
can be applied to any source code, regardless of whether it was
created with the Composer.

4.  To  provide  a  wide  range  of  metrics  measurements  that 
can  be  requested  by  the  user  as  part  of  the  guidelines 
and  standards  to  be  analyzed. 
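
The batch-reporting and automatic-correction goals above can be sketched in Python. This is a deliberately simplified, line-based illustration and not the Guideliner itself, which works on XPAL patterns over syntax trees; the `Guideline` class and both example guidelines are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Guideline:
    """A user-defined guideline: a detector plus an optional automatic fix."""
    name: str
    violates: Callable[[str], bool]
    fix: Optional[Callable[[str], str]] = None


def report(lines, guidelines):
    """Batch mode: list (line_number, guideline_name) for each deviation."""
    return [
        (number, g.name)
        for number, line in enumerate(lines, start=1)
        for g in guidelines
        if g.violates(line)
    ]


def enforce(lines, guidelines):
    """Transformation mode: apply every available fix to violating lines."""
    result = []
    for line in lines:
        for g in guidelines:
            if g.fix is not None and g.violates(line):
                line = g.fix(line)
        result.append(line)
    return result


guidelines = [
    # Detection only: there is no safe automatic rewrite in this sketch.
    Guideline("no-goto", lambda s: "goto " in s.lower()),
    # Detection and correction.
    Guideline("no-trailing-space", lambda s: s != s.rstrip(), lambda s: s.rstrip()),
]
source = ["X := 1;  ", "goto Done;", "Y := 2;"]

print(report(source, guidelines))
print(enforce(source, guidelines))
```

Guidelines without a `fix` are only reported, which mirrors the distinction between flagging a deviation and transforming the code to remove it.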

VIII.  Reengineering Applications

XPAL  is  a  general  language  for  program  recognition 
and  transformation.  It  can  be  used  to: 

1.  Translate  programs  from  one  language  to  another, 
such  as  CMS-2  or  Jovial  to  Ada. 

2.  Detect and correct violations of user-defined guidelines and
standards.

3.  Transform existing non-OO programs into object-oriented
programs.

4.  Port existing programs from one supporting library to
another. This helps automate migration to newer standard
libraries, or to different operating systems and hardware
platforms. As new libraries are created, existing applications
can be searched for potential matches, so that the application
can be modernized and expressed in terms of the new reusable
components.

5.  Modify the meaning of programs. Transformations can be
written to modify existing programs so that they perform new
functions, thus helping create new applications from existing
ones.

6.  Apply isolated transformations interactively. XPAL libraries
can be created to generate bodies out of package specifications,
to split packages or procedures, improve module decomposition,
etc.

8.1  Language  translation 

Typically, language conversion is an abstraction process, very
much the opposite of top-down synthesis or compilation. This is
the case whenever the target language is a higher-level language,
as in the case of translating CMS-2, Jovial or FORTRAN to Ada.
Compilation technologies do not lend themselves well to this
process, and pattern abstraction is highly desirable so that
low-level, implicit, global relationships can be identified and
abstracted into explicit higher-level constructs. XPAL was
designed to support such abstraction. These are some examples of
XPAL applications when converting CMS-2 to Ada:

1.  Patterns can be defined to map different operating-system
dependent multi-tasking models in CMS-2 to the construct-based
tasking model in Ada. These transformations can be done very
effectively since they are a classic example of implied
relationships made explicit by the abstractor. Patterns can be
written for the following:

i.  Building the task structures out of CMS-2 modules and entry
point tables.

ii.  Building the “Message_Center” task out of the specification
of the message broadcasting table for the linker.

iii.  Abstracting critical regions by localizing and
encapsulating the shared data into tasks, from the fragmented
test-and-set protected access found in CMS-2. Such abstraction
supports code migration towards an object-oriented methodology.

iv.  Customizing patterns to support the direct translation of
CMS-2 library procedures for some of these functions (e.g.
critical regions), if they exist.

2.  Abstracting  block  structure  such  as  for,  while  and  exit- 
based  closed  loops  from  goto-based  control  flow. 

3.  Creating procedures to modularize code or to eliminate
unstructured loops, and creating enumeration types from sets of
constants and related variables.
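
One of the abstractions listed above, creating an enumeration type from a set of related constants, can be sketched as a small Python function that emits Ada source text. The function name, the input format, and the rule that the constants must form a dense range starting at zero are assumptions made for this example; an actual XPAL pattern would work on the syntax tree and would also rewrite the uses of the constants.

```python
def enumeration_from_constants(type_name, constants):
    """Return an Ada enumeration declaration for a set of named constants.

    Returns None when the values do not form the dense range 0 .. n-1,
    in which case the pattern is considered not to apply.
    """
    ordered = sorted(constants.items(), key=lambda item: item[1])
    values = [value for _, value in ordered]
    if not values or values != list(range(len(values))):
        return None
    # Ada enumeration literals take positions 0, 1, 2, ... in declaration order.
    literals = ", ".join(name.title() for name, _ in ordered)
    return f"type {type_name} is ({literals});"


# Hypothetical constants recovered from a legacy program.
colors = {"RED": 0, "GREEN": 1, "BLUE": 2}
print(enumeration_from_constants("Color", colors))

# Sparse values carry meaning of their own, so no enumeration is proposed.
print(enumeration_from_constants("Code", {"A": 1, "B": 5}))
```

Rejecting sparse value sets is a conservative choice: it avoids silently changing the numeric values that the original program may depend on.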

These are some of the advantages of XPAL-based translation:

1.  Fully Customizable. This is a requirement for the case of
CMS-2 or Jovial to Ada, since the translation will depend on the
dialect, the executive in use, and library and other environment
dependencies, as well as on the customization of the translated
code to Ada guidelines such as the STARS Ada Reusability
Guidelines.

2.  Fully reusable during subsequent system evolution.
Components developed for translation, since they are
language-independent, can be used interactively during continuing
Ada development (as Ada-to-Ada reengineering tools).

3.  Powerful dual translation and development environments. Part
of the success of the reverse-engineering process (i.e.
translation) depends on how well it is integrated with the
forward-engineering process (i.e. development). Such integration
dictates the success of the translation system for interactive
use.

4.  High-quality of the resulting code. By devising
sophisticated schemes for code abstraction, the translator
designer can make more comprehensive use of the features of the
target language (e.g. Ada). This results in more condensed and
readable code. By not discarding the original implementation
through a very-high-level intermediate language, this approach is
able to maintain comparable efficiency levels.

5.  Predictability. The Xinotech approach, using external
specifications for the translation, allows the user to verify and
approve in advance the ways in which source language structures
will be translated. In a system where the implementation was
discarded, the efficiency of the resulting code would be
completely unpredictable.

6.  Life Cycle Orientation. The XPAL-based approach takes into
account the fact that the translated system will continue to
evolve, so tailored patterns can prepare it for further growth,
by supporting 2167A documentation generation and traceability
with the PDL of choice, extraction of high-level graphs, and
compliance with user-defined standards.

7.  Formally Specified Translation. Another advantage of using
formal specifications is that they provide a highly modular and
functional decomposition of the translation system, resulting in
an accessible mechanism for verifying the translator’s
reliability.

8.  Low-risk development path. This is the result of two
factors: predictability, and the fact that this technology is
implemented progressively, with practical appreciable benefits
available from day one. These benefits continue to grow in
proportion to the resources invested in the project. Its success
can be measured and monitored throughout the development effort.

8.2  Support for Ada 9X compliance and Ada 9X philosophy

The  Xinotech  transformation  environment  includes  a 
set  of  Ada  9X  transformation  libraries  to  support  Ada  9X 
compliance  as  well  as  Ada  9X  philosophy.  In  turn,  these 
libraries  are  managed  by  the  Guideliner’s  user  interface. 

Support for Ada 9X Compliance. The environment provides a library
of transformations that automatically rewrite the constructs in
existing Ada 83 sources that violate the Ada 9X standard. These
transformations can be applied interactively or in batch mode;
the result is compilable Ada 9X code. This library is used to
translate to 9X for compliance, even though the resulting code
may not be object oriented (OO) or otherwise embody Ada 9X
philosophy in any way.

Support for Ada 9X Philosophy. An additional library transforms
9X-compliant programs into a model supporting OO and 9X
philosophy. The OO Ada 9X programs resulting from these
transformations take advantage of 9X-specific features for
modularization, object-orientation, parallelism and
synchronization. Examples:


1.  Transforming a package into a hierarchy with children
packages. This supports improved modularization by allowing the
direct sharing of declarations among a closely-related family of
packages.

2.  Transforming Ada structures to support explicit Ada 9X
vectorization. A few of these cases can be detected
automatically. Otherwise, the user can invoke these
transformations interactively.

3.  Transforming a synchronization model into one with explicit
protected records. In some cases, the old synchronization model
can be derived from the usage of a particular library.

4.  Transforming record types with variants to tagged types with
extensions. This transformation is requested by the user for a
particular record type with variants. The particular record type
is analyzed to determine if the transformation is possible, and
if so, the transformation is performed. This transformation takes
advantage of multiple dispatching to enhance the readability,
object-orientation, and reusability of the code. The simplest
such case involves a record with a single variant whose
discriminant is a value of an enumeration type.

8.3  Real-time prototyping environments

XPAL can be applied to support specification or prototyping
languages such as Luqi’s PSDL [19], [22]. Besides providing an
integrated, interactive front-end for PSDL, XPAL can be used to
verify adherence to design methodologies, to synchronize
graphical with structured editing, and to map between
specification and implementation languages.

8.4  Object  abstraction 

Object abstraction is the process of recognizing relationships in
existing, non-object-oriented (OO) Ada programs, and transforming
these programs into a higher-level, object-oriented architecture
with reusable components.

OO design methodologies have been in use for some time, and are
very useful in helping to understand the behavior of systems and
relationships between components (objects). It seems natural that
obtaining an object-oriented design view of existing non-OO
source programs through reverse-engineering will:

1.  Help us understand the intended behavior of a system and its
relationships.

2.  Allow us to capture this OO design in an OO design language
that can be manipulated textually or graphically by design tools,
thus making it possible to use forward engineering (FE)
technology to analyze, modify and browse through the design.

3.  Allow us to restructure or redesign the existing code so
that it conforms to the recaptured OO design.

Fig. 1. The package in the left window is automatically
transformed into a private object class. Transformation rules
(not shown) are specified in XPAL, the Xinotech Pattern
Abstraction Language.

Examples of XPAL for object abstraction:

1.  Transforming exported data objects into abstract data types.
Data objects will be hidden, and made available only through
access methods (procedures). This includes the automatic creation
of initialization and finalization methods for the data types.

2.  Transforming program units into reusable blueprints (e.g.
generic units in Ada).

3.  Transforming sets of variables into object classes by hiding
them in structured types with access methods.

4.  Transforming variant record types into a base class with
subclasses (e.g. Ada 9X tagged types with extensions). These
transformations will take advantage of multiple dispatching to
enhance readability, object-orientation and reusability.


IX.  Benefits

9.1  Benefits  for  Ada  9X 

This environment represents a rather extensive solution for Ada
reengineering, because it automates the evolutionary migration
from the legacy systems written in the proprietary languages of
the sixties towards the full, object-based design philosophy of
Ada 9X. For example, it can be used to:

1.  Translate  CMS-2  or  Jovial  programs  into  Ada. 

2.  Translate  Ada  83  programs  into  Ada  9X. 

3.  Support the object-orientation of existing Ada code,
according to the philosophy of the new Ada 9X features, thus
enhancing reusability.

4.  Automate the porting of existing Ada applications to new Ada
9X standard libraries, thus enhancing the interchangeability of
the application components.

5.  Automate transformations to change the meaning of existing
programs, thus supporting the adaptation of existing programs to
new applications.

9.2  General  benefits 

Support  for  All  Languages  in  the  Life  Cycle.  Pattern 
abstraction  can  be  applied  to  all  the  languages  in  the 
software  life  cycle,  from  specification  languages,  to  OO 
design  languages,  to  annotation  languages,  programming 
languages,  etc.  XPAL  may  be  used  to  automate  top-down 
translation  during  program  development,  or  to  abstract 
design  and  specifications  during  reverse  engineering. 

Interactive Transformation Environment. Transformations
can also be applied interactively during program
construction. Forward and reverse engineering are thus
integrated in a single homogeneous environment.

Support  for  Multiple  Programming  Languages. 
Through  XML,  the  same  homogeneous  language-based 
environment  is  available  for  many  programming  languages. 
This is particularly attractive for translation between
dialects. The Xinotech environment can also be instantiated
(very cost effectively) for specialized languages, such as
VHDL and database languages.

Open Architectures. The existing Xinotech environment
supports the client-server model of an open heterogeneous
architecture with a graphical user interface.

An Integrated Environment. Xinotech's approach was
to create an integrated semantic environment for
syntax-directed program construction, as well as for analysis and
transformation. Forward and reverse engineering are
indistinguishable. Vast transformation libraries can be expressed
and customized with a metalanguage for pattern abstraction.


This paper presents issues in re-engineering
including familiarity, completeness, existing designs, and
when to re-engineer. The issues are discussed and a
reverse engineering method based on natural language
descriptions of procedural code is described in detail.
These descriptions provide an appropriate level of
abstraction of the procedural code and can be used as the
basis of object-oriented forward engineering using any
Object-Oriented Analysis and Design methodology.


Introduction 

The appeal of object-oriented programming, with its
extensibility, maintainability, and reusability, has enticed
many considering development projects. Organizations
with millions of lines of aging procedural code are also
looking into object-oriented programming. As systems
become more complex and fragile, the strengths of the
object paradigm become more appealing.

The  organizations  with  millions  of  lines  of  code  also 
have  millions  of  dollars  invested  in  that  code.  If  they  are 
to  switch  to  the  object  paradigm,  will  they  just  abandon 
the investment they have made in the procedural code?
Should  they  just  start  over  and  re-design  the  system  from 
scratch?  If  at  all  possible,  the  organizations  prefer  to 
achieve  the  benefits  of  the  object  paradigm  without 
buying  a  replacement  system  or  building  one  from 
scratch. 

This  paper  presents  issues  in  re-engineering 
procedural  code  to  the  object  paradigm.  Re-engineering 
is  the  process  of  examining  an  existing  system  and 
implementing  that  system  in  some  form,  typically  a  new 
programming language [Chikof90]. This is in contrast to
normal forward engineering, which proceeds from
requirements to a design. A specific re-engineering
process may involve reverse engineering of the existing
system. Reverse engineering is the process of examining
an existing system to 1) find the system's components and
their inter-relationships and 2) create representations of
the system at some higher level of abstraction
[Chikof90].

There are certain issues to consider before
re-engineering a system. These issues of familiarity,
completeness,  existing  designs,  and  when  to  re-engineer 
are  covered  fully  below.  Organizations  have  many 
options  for  re-engineering,  and  discussing  these  issues 
will  direct  an  organization  toward  some  re-engineering 
method.  Our  analysis  shows  most  organizations  should 
choose  a  method  that  reverse  engineers  to  some 
intermediate  abstraction,  and  then  forward  engineers  to 
object-oriented  code. 

This  paper  proposes  a  method  for  the  reverse 
engineering  portion  of  this  process.  Our  method  provides 
a  way  to  reverse  engineer  procedural  code  into  an 
abstract  form  that  can  be  forward  engineered  into  object- 
oriented code. It provides an abstraction that is
appropriate for doing any form of object-oriented forward
engineering, using any of the current object-oriented
methodologies. The form of our abstraction is natural
language descriptions of the services provided by the
existing system. These natural language descriptions are
derived from the procedural code in the existing system
as described below.

Before  describing  our  method,  we  look  at  the 
previous  research  in  this  area  and  the  rationale  for  using 
natural  language  descriptions.  After  the  method  is 
presented  in  detail,  we  discuss  conclusions  and  future 
research. 

Previous  Research 

Much work has already been done in the field of
Object-Oriented Analysis (OOA) and Object-Oriented
Design (OOD). Booch [Booch91], Coad and Yourdon
[Coad91], Jacobson [Jacob92], and Wirfs-Brock
[Wirfs90] (to name a few) present methods for building
OO designs from requirements specifications. These
methods rely solely on the OO designer's ability to
extract objects from the requirements specifications.
These methods work for systems that are built from
scratch, implementing specific requirements. They do
not work for systems that are being re-engineered from
procedural code; i.e., using these methods, one cannot
extract objects directly from procedural code.

Other methods [Alabiso88, Bailin89] have combined
some sort of structured analysis and OOD. They provide
mappings from data flow diagrams (DFDs) and data
dictionaries to objects and classes. The methods rely on
the OO designer's ability to extract the objects from the
structured analysis of the requirements as shown in the
form of DFDs and data dictionaries. These methods have
some application in the re-engineering environment. If
the DFDs and data dictionaries can be extracted from
existing code using a reverse engineering tool [STSC92],
then the DFDs and data dictionaries can be used to
develop an OOD. A discussion of this technique follows
in the 'Re-Engineering Issues' section below.

There has been some work in automatically
converting code from one procedural language to another
[Olsem93]. The Re-engineering Technology Report
[Olsem93] provides a long list of source-code translators
from different vendors. For example, some systems
convert from Fortran to C or C++, and others convert
from Pascal to Ada. Even if these translators convert to
an object-oriented language, they do not recognize
objects during the conversion, so they are not building
object-oriented code. They are simply converting to
procedural style code written in a hybrid object-oriented
language such as C++.

Ong  and  Tsai  [Ong93]  have  developed  a  method  for 
extracting  classes  and  objects  directly  from  procedural 
code  (Fortran).  Some  objects  can  be  found  with  hints 
from  the  user,  and  others  are  pulled  directly  from  the 
common  data  block  of  the  Fortran  code.  Their  method 
relies  on  data  flow  analysis  of  the  Fortran  code  to  extract 
methods  from  Fortran  code.  These  methods  are  placed  in 
the  objects  extracted  earlier.  They  do  not  build  an 
intermediate  representation  of  the  Fortran  code,  but  go 
directly from Fortran to C++ code. Their prototype
system generates C++ code for the classes and methods
extracted.

Recent work by Jacobson [Jacob91] provides an
incremental method to re-engineer procedural code to
object-oriented code. Using this method, a part of the
existing system is selected for re-engineering, and an
interface to this new object-oriented part of the system is
developed. During the conversion, the designer reverse
engineers to a formal design diagram and extracts the
objects from this formal design diagram. Jacobson's
method develops an intermediate representation of the
procedural code before converting that representation to
object-oriented code. The designer is still left with the
task of finding the objects in the formal design diagram
created from the reverse engineering.

Re-Engineering  Issues 

There are certain issues that organizations must
consider  before  re-engineering  a  system.  These  issues 
include  familiarity,  completeness,  existing  designs,  and 
when  to  re-engineer. 

Familiarity. A pervasive issue in re-engineering a system
is how familiar you should become with the procedural
system. If you have been using the system, you will
already be familiar with the system, its overall purpose,
functionality, and general features. But how much
familiarity with the low-level functions of the system is
required? To what level of detail should you analyze the
system?

Jacobson [Jacob91] proposes that a procedural system
must be reverse engineered all the way back to a formal
design before the conversion. This could be a lengthy
process  for  a  large  system,  but  it  will  help  you  become 
familiar  with  the  system.  Using  this  method  (even  with 
the  incremental  nature  of  the  method)  the  reverse 
engineering  process  alone  could  take  years  for  large 
systems. Is it really necessary to know the system in this
much detail? Is it necessary to have an exact formal
design diagram for the system? How do you know that
you  are  familiar  enough  with  the  system  to  begin  the 
conversion? 

If  you  already  have  a  formal  design  of  the  system, 
then  you  may  want  to  use  that  design.  If  there  is  not  one 
available,  it  is  our  opinion  that  an  exact  formal  design 
diagram is not needed. The person doing the conversion
(the converter) must be somewhat familiar and
comfortable  with  the  functions  of  the  system,  but  the 
level  of  this  familiarity  can  be  quite  fuzzy.  Some  of  the 
familiarity  will  come  from  the  iterative  nature  of  the 
conversion  process.  The  more  the  converter  works  with 
the  procedural  code,  the  more  familiar  he/she  will 
become. 

As the converter becomes more familiar with the
system, they should be looking for things that may later
become objects in the object-oriented code. Can these
objects be found only in requirements specifications?
Many OOA methods [Booch91, Coad91] start only from
requirements specifications. If the objects can be found
only in requirements specifications, then the converter
must look for requirements of the system when becoming
familiar with the system.


If objects can be found in other places than the
requirements specifications, then the converter should
look for other things while becoming familiar with the
system. Ong and Tsai [Ong93] use the data structures
found in Fortran code to extract objects. Objects are
often found in other things than just the requirements, so
the converter should be looking for other things than just
requirements. The converter should be looking for the
services being performed by the modules of the
procedural system. These services will be used to
identify the behavior of objects in the new system.

So, when you are becoming familiar with a
procedural system, should you be looking for
requirements or just services of the system? It is our
opinion that you should look for services of the system.
These services will tell you the current functionality of
the system, and will help you determine the objects
required in the system. The services will also help
document the progress of the conversion as described
below.

Completeness. An important question for the converter is
"When is the conversion done?" The converter must be
able to measure the progress of the conversion and
determine when the job is complete. One measure of this
is whether all the services or requirements of the original
system are in the converted system.

If the converter has access to the original
requirements document, they can reference this document
to measure progress of the conversion. Of course, if the
converter has the original requirements document, they
may want to do OOA on the original requirements.
However, these requirements may not accurately reflect the
functionality of the current system, so the converter may
need to rely on the services identified while becoming
familiar with the system.

If  no  requirements  are  available,  the  converter  must 
rely  on  the  services  found  in  the  system  to  measure  the 
progress  and  completeness  of  the  conversion.  A  list  of 
the  services  should  be  compiled  by  looking  at  the 
existing  code.  Our  method,  presented  below,  includes 
this  step. 
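
One way to picture this measure is as a checklist comparison between
the services listed for the original system and the services accounted
for in the converted system. The Python sketch below is only an
illustration under that assumption; the function name
conversion_progress and the sample service names are hypothetical and
are not part of the method itself.

```python
def conversion_progress(original_services, converted_services):
    """Return (fraction_done, missing) for a conversion in progress.

    original_services:  services listed from the existing procedural code
    converted_services: services already present in the converted system
    """
    original = set(original_services)
    converted = set(converted_services)
    # Services of the original system not yet found in the new system.
    missing = sorted(original - converted)
    if not original:
        return 1.0, missing  # nothing to convert
    done = len(original & converted) / len(original)
    return done, missing


# Hypothetical service names, for illustration only.
original = ["edit design name", "edit design author", "list modules"]
converted = ["edit design name"]
fraction, missing = conversion_progress(original, converted)
print(f"{fraction:.0%} converted; still missing: {missing}")
```

The conversion would be considered complete, in this simplified view,
when the list of missing services is empty.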

Existing Design. What if there is already an existing
procedural design for the system? To what extent should
the converter use this design? Alabiso's method
[Alabiso88] converts a DFD to an object design. But
using the functionally decomposed design can cause the
designer to miss some of the objects that should be built
into the object design [Sward93]. If at all possible, the
designer should not use an existing design to build an
object design because the functional decomposition that
was done to build the existing design will taint the object
design.

The  existing  design  can  be  used  to  become  familiar 
with  the  system.  The  converter  can  look  at  an  existing 
design to find out how procedures interact. The designer
may even be able to find high level services that the
system is providing by looking at the design.

When to re-engineer. Sittenauer and Olsem [Sitten92]
present a model to estimate whether or not an
organization should re-engineer. The question is a
good one. How does an organization know when to
re-engineer? Are there measurable indications of the
appropriate time to re-engineer? The Sittenauer and
Olsem model examines the complexity, importance, and
lifetime of a candidate system. These indicators are
measured and graphed to help the organization determine
if the system should be re-engineered.

All of these issues must be considered before
re-engineering. The next section presents our rationale for
using  natural  language  descriptions  when  becoming 
familiar  with  the  procedural  system. 

Rationale  for  Natural  Language 

After an organization has determined it is time to
re-engineer, the first step is to become familiar with the
system and determine the services provided by the
system. We are proposing that these services are best
described using natural language for the following
reasons: 1) natural language will capture the terminology
of the problem statement that has been instilled into the
procedural code, 2) natural language provides an
appropriate intermediate level of abstraction, and 3)
natural language descriptions will provide an input into
object-oriented forward engineering.

Terminology. As the converter of a system becomes
familiar with that system, they need to learn the
terminology of the system. The terminology of the
system holds clues to the objects that will be built in the
object-oriented system. One way to learn this
terminology is to write natural language descriptions of
the services provided by the system. Representing
services in natural language forces the converter to
discover the terminology of the procedural system. More
importantly, it helps the converter discover the
terminology of the original problem statement. As the
converter describes the services of the system in natural
language, they may begin to get a feel for some of the
objects that will appear in the object system.


Level of Abstraction. In the process of reverse
engineering we develop an intermediate representation of
the existing system. This representation must be at the
right level of abstraction. If it is not abstract enough,
there may still be remnants of the procedural design in the
representation and these will surface in the object design.
If the representation is too abstract, it may be too time
consuming to build without providing much added value.

Alabiso's method [Alabiso88] of converting DFDs
and data dictionaries to objects does not provide an
appropriate level of abstraction. If the DFDs and data
dictionaries are extracted from the procedural code, they
will still be tainted with the functional decomposition
done to build the designs. Ong and Tsai's method
[Ong93] is also tainted by the functional decomposition
when it looks for objects in the data structures of Fortran.
The objects in the object design are not necessarily tied
to the data structures in Fortran. The data structures can
give us clues to the objects in the object design, but there
is no one-to-one correlation.

Jacobson's method [Jacob91], on the other hand,
requires we build a formal design describing the existing
system. To develop a design at this level of abstraction
takes a considerable amount of work. This may be too
time consuming for large systems without providing
much more capability in the design. We need a
representation that is somewhere in between these two
levels of abstraction.

Our method provides this intermediate level of
abstraction by using natural language descriptions of the
services of the existing system. Natural language
descriptions do not have any procedural flavor to them,
and they are not time consuming to construct. They
provide an acceptable level of abstraction for
representing procedural code and can be used for
object-oriented forward engineering. The descriptions are built
using the data structures and procedures in the existing
system. Objects do not come directly from these data
structures or procedures, but come from OOA on the
descriptions of the services provided by them.

Forward engineering. The natural language descriptions
that are built will be used as input to object-oriented
forward engineering. Most OOA methodologies use
requirements statements written in natural language. The
natural language descriptions built with our method take
a similar form to requirements statements, so OOA on the
descriptions is a similar task.

Another benefit of the natural language descriptions
is that designers already familiar with OOA
methodologies do not need to learn a new methodology
to design from the descriptions. The expertise gained
already in finding objects can be used to find objects in
the descriptions. Thus, organizations will not need to
spend more money training their designers.

We have analyzed the issues in re-engineering
presented  earlier  and  for  the  above  reasons  have  chosen 
natural  language  descriptions  as  the  abstract 
representation  produced  from  our  reverse  engineering 
method.  Our  method  is  described  fully  below. 

The  Method 

This section presents a method for becoming familiar
with a procedural system so that it can be converted to
the object paradigm. The method helps the user develop
natural language descriptions of the services of the
procedural system. These natural language descriptions
provide the raw material for OOA.


I.  Describe  the  data  structures 

II.  Describe  the  procedures 


Figure 1 - Phases of the Method

The method is split into two phases as shown in
Figure 1. The first phase involves describing the data
structures and listing where the data structures are
defined. The second phase looks at the procedures and
describes the services being performed by them.

Phase I. This phase of the method lets the converter
become more familiar with the data structures and the
information being stored. Figure 2 shows the three steps
of Phase I.


1. List the data structures

2. List where the data structures are
defined

3. Write a natural language description
of the data structures


Figure 2 - Steps in Phase I

The first step is to list all the data structures in the
procedural system. These may be defined in several
different units of the system or in a separate unit just for
definitions. The next step is to list where these data
structures are located. This list is used for future
reference. The third step is the most important one of this
phase of the method. In this step, the converter writes a
natural language description of each data structure. As
described above, this type of description helps the
converter become familiar with the terminology of the
procedural system.

Since  the  natural  language  description  is  important, 
we  provide  an  example  description  of  the  following 
Pascal  data  structure: 


Module_List = RECORD
   mlist : ARRAY[1..Max_Modules] OF Module_Ptr;
   mcnt  : INTEGER;
END;


The  natural  language  description  of  this  data 
structure  might  be: 


Module list is a record with two fields:

1) a list of pointers to modules stored in
an array from 1 to the maximum
number of modules and

2) an integer that points to the last
module pointer in the array.

Once the natural language descriptions are built for
all the data structures, the converter proceeds to Phase II.
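
The three steps of Phase I yield, for each data structure, a name, a
location, and a description. A minimal Python sketch of such a
catalogue entry follows. The class name DataStructureEntry, its field
names, and the unit name defs.pas are illustrative assumptions; the
description itself is still written by hand by the converter.

```python
from dataclasses import dataclass


@dataclass
class DataStructureEntry:
    name: str              # step 1: the data structure
    defined_in: str        # step 2: unit where it is defined
    description: str = ""  # step 3: natural language description

    def is_described(self) -> bool:
        return bool(self.description.strip())


catalogue = [
    DataStructureEntry(
        name="Module_List",
        defined_in="defs.pas",  # hypothetical unit name
        description=(
            "Module list is a record with two fields: a list of pointers "
            "to modules, and an integer that points to the last module "
            "pointer in the array."
        ),
    ),
    DataStructureEntry(name="design_rec", defined_in="defs.pas"),
]

# Phase I is finished once every entry has a description.
undescribed = [entry.name for entry in catalogue if not entry.is_described()]
print(undescribed)
```

Keeping the three items together in one entry makes it easy to see
which data structures still need a description before Phase II begins.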

Phase II. This phase of the method deals with the
procedures of the existing system. In this phase, the
converter tries to extract the services being performed by
the procedures. The steps of this part of the method are
shown in Figure 3.

In this phase of the method, the converter should get
a sense of the entire procedural system. As the converter
builds the natural language descriptions of the procedural
processing, he/she should better understand the services that
the procedural system accomplishes.


4. List all the procedures in the system

5. List the parameters for each of the
procedures

6. Focus on one data structure

7. List the parts of the data structure
each procedure uses

8. Write a natural language description
of the processing for each procedure

9. Describe the impact on the selected
data structure using natural language

10. List any services found

11. Repeat steps 6-10 until all the data
structures have been considered

12. Write natural language descriptions
for any remaining procedures


Figure  3  -  Steps  in  Phase  II 

In step 4, the converter lists the unit and procedure
names. This is simply a reference list for the converter to
make sure all the functions in the system have a natural
language description. The next step is to add to the list
of procedures all the parameters for the procedures. This
is another list that helps the converter become familiar
with the processing and terminology of the system. An
example list is shown below.


Procedure       Parameters

edit_name       design_rec
edit_author     design_rec

In  step  6,  the  converter  picks  one  data  structure  and 
focuses  attention  on  it.  We  suggest  starting  with  data 
structures  that  have  multiple  attributes  such  as  a  Pascal 
record  structure  or  a  C  struct  structure.  Which  data 
structure  to  focus  on  is  not  crucial  to  the  conversion 
process,  but  we  recommend  starting  with  large  data 
structures  because  they  tend  to  be  central  to  more  system 
processes. 

An  example  of  a  Pascal  record  structure  is  shown 
below. 


design_rec = RECORD
   dname   : STRING[30];
   fdir    : STRING[8];
   modules : module_list_type;
   author  : STRING[30];

In step 7, the converter lists the parts of the data
structure each procedure in the system uses or modifies.
The converter should prepare a list with each procedure
name followed by the parts of the data structure used or
modified in the procedure.

It may be the case that the entire data structure is
passed into the procedure, but the procedure only uses part
of the structure. It is at this point the converter should
make every effort to partition out this tramp data. Tramp
data is data extraneous to the processing of the procedure.
The converter should list only those parts of the data
structure that are used or modified in the procedure.

The list of procedures and parameters built in step 5
is helpful here. The converter can look on the list to see
if the selected data structure is included in the parameter
list for a procedure. If it is, the converter examines the
extent to which the data structure is used or modified in the
procedure. The parts of the data structure that are used or
modified should be included in the list built for this step.

An example is shown below for the edit_name
procedure. The entire design_rec data structure is passed
as a parameter to edit_name. However, the only part of
the design_rec data structure that is modified in
edit_name is dname.


edit_name       dname
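
The step 7 list and the notion of tramp data can be sketched as a set
difference: the fields passed to a procedure minus the fields it
actually uses or modifies. The Python below is a minimal illustration
only. The helper name tramp_fields is hypothetical, the field list is
limited to the design_rec fields shown earlier, and the entry for
edit_author is an assumption made for the example.

```python
def tramp_fields(passed_fields, used_fields):
    """Fields passed to a procedure but neither used nor modified by it."""
    return sorted(set(passed_fields) - set(used_fields))


# Fields of design_rec as far as they are shown in the record above.
design_rec_fields = ["dname", "fdir", "modules", "author"]

# Step 7 list: procedure name -> parts of design_rec it uses or modifies.
# The entry for edit_author is assumed for illustration.
usage = {
    "edit_name": ["dname"],
    "edit_author": ["author"],
}

for procedure, used in usage.items():
    tramp = tramp_fields(design_rec_fields, used)
    print(procedure, "uses", used, "tramp data:", tramp)
```

Which fields a procedure really uses still has to be determined by
reading the code; the sketch only records the result of that reading.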


At this point, the converter will be getting a better
idea about what parts of the selected data structure are
used in the different procedures. It is a logical next step
to describe the processing of these parts of the data
structure in natural language. In step 8, the converter
takes the lists of procedures and parts of the data
structure as defined in step 7 and writes natural language
descriptions for each of the procedures.

The converter should use only natural language to
describe the processing of the procedure and avoid
shortcuts such as using the procedure name or the data
structure name. If there are any procedures that are
called from inside this procedure, the converter should
develop natural language descriptions of these procedures
before moving on. This allows the converter to follow
and understand the processing going on in the procedure.
An example of the natural language description of
edit_name is shown below.


The  procedure  that  edits  the  name  of  a 
design  is  passed  the  entire  design  record.  It 
changes the value of the name of the design
by  prompting  the  user  for  a  new  value  and 
then  setting  the  design  name  to  this  new 
value.  The  new  value  is  checked  against  all 
the  old  design  names  to  assure  no 
duplicates  are  permitted. 


The next step, step 9, is to expand the descriptions
built in step 8. The descriptions are expanded by
including the impact on the data structure by each
procedure. This is where the converter can show exactly
to what extent the data structure is changed by the
procedure. Any tramp data can be delineated clearly in
this step. By now, the converter should have a good feel
for the terminology of the procedural system and the
services being performed on the selected data structure.
An example of the natural language generated for
edit_name during step 9 is shown below.


The procedure that edits the name of a
design  is  passed  the  entire  design  record.  It 
changes  the  value  of  the  name  of  the  design 
by  prompting  the  user  for  a  new  value  and 
then  setting  the  design  name  to  this  new 
value.  The  new  value  is  checked  against  all 
the  old  design  names  to  ensure  there  are  no 
duplicates. 

This  procedure  does  not  change  the 
entire  design  record.  Only  the  name  field  of 
the  design  record  is  modified.  The  entire 
processing  involves  the  name  of  the  design 
and  the  list  of  all  the  other  design  names. 


Step 11 in this phase of the method is iteration. In
this step, the converter repeats steps 6 through 10,
choosing a new data structure to focus on (from step 6).
This iteration continues until all the data structures have
been considered. It could be the case that some
procedures do not have parameters and will not be
considered in steps 6-11. Step 12 provides for this by
requiring the converter to make sure all procedures have
natural language descriptions built for them.
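
The bookkeeping behind steps 11 and 12 can be sketched as a loop over
the data structures followed by a check for procedures that were never
reached. The Python below is a rough illustration under stated
assumptions: the function names are hypothetical, show_help is an
invented parameterless procedure, and a placeholder string stands in
for the description the converter would write by hand in steps 7-10.

```python
def procedures_using(data_structure, parameters):
    """Procedures whose step 5 parameter list includes the data structure."""
    return [p for p, params in parameters.items() if data_structure in params]


def undescribed_procedures(all_procedures, descriptions):
    """Step 12 check: procedures that still lack a description."""
    return [p for p in all_procedures if not descriptions.get(p, "").strip()]


# Step 4 and step 5 lists; show_help is a hypothetical procedure
# without parameters.
all_procedures = ["edit_name", "edit_author", "show_help"]
parameters = {
    "edit_name": ["design_rec"],
    "edit_author": ["design_rec"],
    "show_help": [],
}
descriptions = {}

for data_structure in ["design_rec"]:  # steps 6 and 11
    for proc in procedures_using(data_structure, parameters):
        # Steps 7-10 are manual; this placeholder marks the work as done.
        descriptions[proc] = f"(description of {proc} written by the converter)"

remaining = undescribed_procedures(all_procedures, descriptions)
print(remaining)
```

Any procedure left in the remaining list is one that step 12 requires
the converter to describe before forward engineering starts.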

After step 12, the converter has natural language
descriptions of all the data structures and procedures in
the procedural system. The converter can now proceed to
forward engineer the system using whatever OOA and
OOD method they prefer.

Conclusion


The  reverse  engineering  method  presented  in  this 
paper  provides  an  intermediate  representation  of 
procedural  code.  This  representation  is  at  an  appropriate 
level  of  abstraction  in  that  it  is  not  too  close  to  the 
existing  procedural  code  or  so  abstract  that  it  takes  great 
amounts  of  work  to  create.  The  natural  language 
descriptions  are  built  during  the  time  a  converter  is 
beconting  familiar  with  the  system.  The  descriptions 
help  the  converter  become  more  familiar  with  the 
services  provided  by  the  system.  These  services  define 
the  completeness  and  fimcdonality  of  the  new  object- 
oriented  system.  The  descriptions  provide  a  medium  to 
forward  engineer  using  an  object-oriented  methodology 
with  which  the  converter  is  familiar. 
Future Research


In  the  future,  we  hope  to  partially  automate  our 
reverse  engineering  method.  The  generation  of  the  lists 
in steps 1, 2, 4, and 5 can easily be automated, whereas
the generation of the natural language descriptions is a
somewhat  more  difficult  artificial  intelligence  problem. 
Abstract

We  present  results  from  a  case  study  on  the  use  of 
object  oriented  techniques  to  reengineer  the  Ballistic 
Trajectory  Algorithm  (BTA)  software.  The  BTA 
software  is  used  in  the  Operational  Flight  Programs 
(OFPs)  of  several  Navy  attack  and  fighter  aircraft  to 
determine  the  release  point  for  ballistic  weapons. 
Though the algorithm itself is simple, the nature of
its implementation in space- and time-limited OFPs
has resulted in an extremely complex
implementation, typical of OFP code. The structure
of the basic algorithm is entirely hidden by the
embedding of the integration method and scheduling
considerations into the algorithm and by the large
number of special cases introduced to handle specific
weapons. Existing documentation of the basic
algorithm is totally inadequate for understanding the
actual program. This paper provides a demonstration
of  how  object-oriented  properties  such  as  inheritance 
can  be  used  to  control  the  complexity  of  the 
implementation,  yielding  a  much  more 
understandable,  maintainable,  and  reusable  program, 
and  describes  the  methodology  used  and  lessons 
learned  in  the  reengineering  effort. 

1  Introduction 

Since the first airborne digital computers were introduced
in operational Navy attack aircraft in the 1960s, significant
amounts  of  resources  have  been  invested  in  the  design, 
development,  and  maintenance  of  avionics  software  systems, 
most  of  which  are  written  in  assembly  language  and  all  of 
which  run  on  obsolete  computers  by  today's  standards.  The 
existing  engineering  data,  which  include  the  assembly  codes, 
structured  flowcharts  (called  math  flows),  detailed  test 
procedures  for  safety  of  flight  and  Fleet  certification,  and 
tactical  manuals,  are  insufficient  to  support  redevelopment 
using modern software engineering technologies. Indeed, for
a recent upgrade to the hardware used by an OFP,
redevelopment of the operational weapon systems software
for  a  new  computer  was  deemed  too  risky  and  costly. 
Consequently,  new  hardware  was  developed  to  emulate  the 
obsolete  mission  computer  in  order  that  the  current  software 
could  be  run  with  limited  changes  in  the  new  hardware. 
Possible  concerns  are 

1. Long change cycles on the order of 18-24 months to
fix errors and enhance functionality

2. Expensive, labor-intensive, error-prone procedures to
analyze, design, code, and test changes

3. Difficulty in restructuring the software to
accommodate significant modifications

4.  Chronic  shortages  of  sufficient  processor  throughput 
and  memory,  since  the  fragility  and  machine-specific 
nature  of  the  software  inhibit  migration  to  new 
hardware 

In  the  case  of  the  Ballistic  Algorithm,  the  program  is 
currently stable, and "maintenance" in the sense of fixing
bugs is not really a problem. The real problem is that
several characteristics of the OFP (e.g., scheduling period,
integration algorithm, and specific weapon types) are built
into  the  code,  such  that  the  code  is  so  complex  that  it  is  not 
feasible  to  consider  improving  the  methods  used  in  the 
ballistic  algorithm.  This  results  in  time-critical  code  that 
cannot  be  improved,  even  though  more  up-to-date  numerical 
methods  that  could  improve  performance  are  available. 
Considerable  man  hours  are  spent  integrating  new  weapon 
types  into  the  program,  due  primarily  to  a  largely  manual 
process  of  curve  fitting  for  lift,  drag,  and  other  data,  where 
changes  to  the  method  used  in  the  basic  algorithm  might 
eliminate  the  need  for  this  process  entirely.  The  complexity 
of the code has increased with time as new weapons are added
and  as  it  is  nearly  impossible  to  remove  anything  from  the 
code  because  of  the  possible  side  effects  of  removing 
obsolete  code. 
Because of the smaller defense budget, the recent need to
cut  operational  costs  within  the  Navy  and  the  Department  of 
Defense  in  general  has  provided  additional  incentive  to 
develop  techniques  for  reengineering  deployed  avionics 
software  to  reduce  life-cycle  costs.  This  paper  examines  the 
contribution  that  object-oriented  programming  (OOP) 
technology  can  make  to  reengineering  legacy  systems.  Our 
reengineering  expectations  for  an  end-product  are 

1.  Reengineering  techniques  that  form  the  foundation 
necessary  as  a  first  step  towards  reengineering 
avionics  software  in  real-time 

2.  A  simpler,  less  complex  code  structure  that  is  easier 
to  understand 

3. Reduced effort required to add new and/or modify
existing  avionics  software  functionalities,  weapon 
types,  or  release  conditions 

4.  Retention  of  the  same  or  higher  degrees  of  accuracy  as 
currently provided in the existing avionics software

5. Reduced life-cycle costs of current avionics software

6.  An  existence  proof  that  it  is  feasible  to  contemplate 
reengineering  deployed  avionics  software  using 
modern software engineering technologies to reduce
life-cycle  costs. 

As  representative  of  the  computationally  intensive 
operational  software  in  Navy  aircraft,  the  BTA  software  was 
selected  as  the  target  for  reengineering.  Our  paper  presents 
results  from  a  case  study  to  (1)  investigate  and  evaluate 
techniques  for  reengineering  the  BTA  software  using  detailed 
requirements  and  design  information  extracted  from  the 
existing  programs  and  (2)  demonstrate  these  techniques  on 
selected components of the BTA software. We first describe
the  BTA  software,  and  then  discuss  its  analysis  and  redesign 
in  the  reengineering  process.  Our  experiences  and  results  in 
the  reengineering  process  are  presented. 

2  Ballistic  trajectory  algorithm 

Most  unguided,  free-fall,  air-to-ground  weapons  launched 
from aircraft to attack targets on the ground or surface, e.g.,
bullets,  drogued  or  retarded  bombs,  cluster  munitions, 
streamlined  bombs,  and  unguided  rockets,  can  be  described  as 
ballistic  weapons  [1].  Forces  considered  on  these  weapons 
after  release  from  the  aircraft  are  lift,  gravity,  and 
aerodynamic drag. A computational requirement imposed on
airborne digital computers aboard tactical Navy attack aircraft
is  the  prediction  of  range  and  time  of  fall  for  these  ballistic 
weapons,  given  the  following: 

1. Position of the target relative to the aircraft

2.  Velocity  of  the  aircraft  relative  to  the  air  mass 

3.  Direction  and  magnitude  of  gravity 

4.  Velocity  of  the  air  mass  relative  to  the  ground 

5.  Velocity  of  the  target  relative  to  the  ground 

These quantities make up the weapon-release
conditions and are supplied by aircraft sensors (e.g.,
inertial platforms, air data sensors, radar or laser trackers,
target tracking devices).

Given  the  above  weapon-release  conditions,  the 
successful  release  of  a  ballistic  weapon  so  that  it  impacts  at 
a desired point on the ground requires an a priori prediction
of the weapon's trajectory. This problem is illustrated in
Figure 1.


Figure  1.  Ballistic  Trajectory  In  the  X-Y  Plane 


As  depicted  in  Figure  1,  everything  necessary  is  assumed 
known about the release point. The impact range and time of
flight  are  to  be  computed.  The  BTA  software  provides  this 
important  functionality.  Initially,  the  BTA  software  was 
designed  for  single-stage,  free-fall  bombs,  but  has  been 
modified  over  the  years  to  accommodate  newer, 
technologically  advanced  weapons,  such  as  rockets  and  free- 
fall  bombs  with  lifting  surfaces. 
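
As a rough sanity check on the quantities shown in Figure 1, the drag-free (vacuum) case has a closed-form answer. The sketch below is not the BTA method, which accounts for drag and lift numerically; the function name, the dive-angle convention, and the constant value of gravity are assumptions made for illustration only.

```python
import math

G = 9.80665  # standard gravity in m/s^2 (assumed constant)

def vacuum_release(altitude_m, speed_mps, dive_angle_deg=0.0):
    """Return (time_of_fall_s, range_m) for a drag-free point mass.

    dive_angle_deg is measured below the horizontal; 0 means level release.
    """
    theta = math.radians(dive_angle_deg)
    vx = speed_mps * math.cos(theta)   # horizontal speed, constant in vacuum
    vy = speed_mps * math.sin(theta)   # initial downward speed
    # Solve altitude = vy*t + 0.5*G*t^2 for the positive root t.
    t = (-vy + math.sqrt(vy * vy + 2.0 * G * altitude_m)) / G
    return t, vx * t
```

Any real weapon falls short of this range because of drag, so the vacuum figure is useful only as an upper bound when checking a numerical solution.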
Currently, two versions of the BTA software exist. The
first is the actual real-time version written in assembly
language for the airborne digital computers aboard the
tactical  attack  aircraft.  The  second  version  consists  of  76 
FORTRAN  routines.  The  FORTRAN  version  is  used 
mainly  to  fine-tune  weapon  coefficients  for  the  real-time 
version  and  is  also  used  as  a  test  program  for  safety  of  flight 
and Fleet certification.

While  the  BTA  software  has  been  used  successfully  since 
its  development  in  the  late  1960s,  several  of  the  computing 
constraints  (e.g.,  speed,  storage  capacity,  numerical 
algorithms) that confronted its design at the time no longer
exist  today.  As  currently  designed  and  implemented,  the 
BTA  software  does  not  provide  sufficient  flexibility  to  allow 
easy  modification  of  weapon  types  or  release  conditions. 
One consequence has been that, over its 25-year life span,
only small modifications have been attempted. The result is
code tangled from years of patchwork.

Given  the  investigative  nature  of  our  task  and  the  time 
and  other  resource  constraints  that  are  attributable  in  part  to 
limited  funds,  we  selected  the  ALGO  subroutine  component 
of  the  BTA  software  as  the  target  for  reengineering.  ALGO 
is  the  'main'  subroutine  that  forms  the  heart  of  the 
FORTRAN version of the BTA software. Though of
moderate size, the ALGO subroutine has a high degree of
complexity (illustrated in Figure 2) that is representative
of both the BTA software and the existing avionics software
systems in general.


3  Analysis 

Our  main  task  in  the  reengineering  process  was  the 
analysis  of  the  BTA  software.  The  motivation  behind  the 
analysis  is  twofold: 

1. To recover and capture the essential design and
requirement information necessary for redesign and
redevelopment

2.  To  identify  the  presence  of  software  engineering 
anomalies  (e.g.,  non-modularity,  code/data 
dependency,  and  lack  of  clarity)  in  the  existing  code 

A  common  barrier  to  software  reengineering  is  the  difficulty 
of  understanding  the  design  assumptions  and  constraints 
embedded  in  a  piece  of  code  [2].  Our  own  experience 
confirms  this;  the  design  and  requirement  information  about 
what  the  BTA  software  does  is  inextricably  embedded  in  the 
code.  The  code  itself,  tangled  from  years  of  patchwork, 
describes  not  what  was  done,  but  rather  how  it  was  done. 


ALGO (A)

Language: Fortran
Design Graph Superimposed
Cyclomatic 151
Essential 37
Design 15


Figure  2.  Complexity  Analysis  of  ALGO 
subroutine  component  of  the  BTA  software 


In  analyzing  the  ALGO  subroutine  component,  we 
examined the 751 lines of FORTRAN code line by line to
recover the design and requirements information. This
process was both time- and labor-intensive. In one instance,
for example, recovering the implementation of a
second-order Runge-Kutta numerical integration method took in
excess of 40 man-hours. This intrinsic complexity, depicted
in Figure 2, stems from the original design.

The cyclomatic complexity value of 151 (Figure 2)
represents a measure of the ALGO subroutine's decision
structure. The essential and design values of 37 and 15
represent the degree to which the module contains
unstructured constructs (i.e., branching in and out of loops
and decision nodes) and the complexity of the module's
calling patterns, respectively. According to McCabe, a
module whose flow graph has e edges and n nodes has a
cyclomatic complexity of e - n + 2 [3]. A cyclomatic
complexity  value  between  12  and  20  is  considered  to  be 
within  the  norms  of  traditional  software  engineering. 
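
The formula e - n + 2 can be checked on a small flow graph. The sketch below is only an illustration of the arithmetic for a single connected graph; the edge-list representation and the example graph are assumptions and have nothing to do with the tool that produced Figure 2.

```python
def cyclomatic_complexity(edges):
    """Compute McCabe's e - n + 2 for a single connected flow graph.

    edges is a list of (source, target) node pairs.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2

# Hypothetical flow graph: an IF/ELSE followed by a loop.
example_edges = [
    ("entry", "if"), ("if", "then"), ("if", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"), ("loop", "exit"),
]
```

The example has 8 edges and 7 nodes, so its complexity is 3: one for the straight-line path plus one for each of the two decisions.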

The  analysis  effort  on  the  ALGO  subroutine  was  a  five- 
step  process. 
In the first step, all loops and code blocks within the
executable  part  of  the  subroutine  were  identified  and  assigned 
names  and/or  block  labels. 

In  the  second  step,  we  identified  all  input/output  (I/O) 
processing  and  separated  this  from  the  non-I/O  processing. 
The set of actual I/O data, including data structures, was also
identified. The data structures identified in this step included
(1)  five  two-dimensional  REAL  arrays,  (2)  a  one- 
dimensional  REAL  array,  and  (3)  two  one-dimensional 
INTEGER  arrays. 

In the third step, all hard-coded literals and constants in
the executable part of the code were identified. The intent
was to decouple physical data from the actual code. Named
constants were then declared, and the literals and constants
within the executable part that could be replaced were then
replaced with named constants.
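
The effect of the third step can be illustrated with a before-and-after pair. This is a sketch in Python rather than the FORTRAN of the original; the function names, constant names, and numeric values are all hypothetical and are not taken from the BTA code.

```python
# Before: physical data buried in the executable code as bare literals.
def drag_accel_before(speed):
    return 0.5 * 1.225 * speed * speed * 0.02

# After: the same computation with the literals lifted into named constants.
AIR_DENSITY_SEA_LEVEL = 1.225   # kg/m^3, illustrative value
DRAG_AREA_OVER_MASS = 0.02      # Cd*A/m in m^2/kg, hypothetical weapon datum

def drag_accel_after(speed):
    return 0.5 * AIR_DENSITY_SEA_LEVEL * speed * speed * DRAG_AREA_OVER_MASS
```

Both functions return the same value; the second simply makes the physical data visible and changeable in one place.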

In the fourth step, all variable references throughout the
code  were  reconciled.  The  objective  was  to  identify  exactly 
where  and  how,  in  the  source  code,  each  variable  was 
referenced.  This  was  particularly  helpful  in  identifying  the 
variables  that  were  used  only  for  specific  weapon  types  and 
also  facilitated  the  design  of  our  object-class  structure  in  the 
redesign  phase. 
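
A variable cross-reference of the kind described in the fourth step can be approximated mechanically. The sketch below is a crude illustration, not the procedure used in the study: it ignores comments, strings, and continuation lines, and the sample statements and the names ITYPE and RKFAC are hypothetical.

```python
import re
from collections import defaultdict

DOT_OPERATOR = re.compile(r"\.[A-Za-z]+\.")          # e.g. .EQ. or .LT.
IDENTIFIER = re.compile(r"\b[A-Za-z][A-Za-z0-9_]*\b")
KEYWORDS = {"IF", "THEN", "ELSE", "ENDIF", "DO", "CONTINUE", "GOTO", "END"}

def cross_reference(source_lines):
    """Map each identifier to the 1-based line numbers where it appears."""
    refs = defaultdict(list)
    for number, line in enumerate(source_lines, start=1):
        # Drop dotted operators so they are not mistaken for variables.
        code = DOT_OPERATOR.sub(" ", line.upper())
        for name in IDENTIFIER.findall(code):
            if name in KEYWORDS:
                continue
            if number not in refs[name]:
                refs[name].append(number)
    return dict(refs)

# Hypothetical FORTRAN-like fragment.
sample_lines = [
    "      DRAG = CD * Q",
    "      IF (ITYPE .EQ. 3) DRAG = DRAG * RKFAC",
    "      V = V - DRAG * DT",
]
```

In such a listing, a variable that appears only on lines guarded by a weapon-type test (RKFAC in the sample) is a candidate for relocation into a weapon-specific class.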

Finally, in the fifth step, all commonly duplicated blocks
of code, as exemplified in Figure 3, were identified for
elimination by the establishment of reusable methods in the
design phase.

Once  analysis  of  the  code  was  complete,  the  redesign  and 
subsequent redevelopment in SmallTalk-80* was initiated.
The  intent  was  not  to  introduce  yet  another  language  such  as 
SmallTalk  into  the  long  list  of  languages  used  for  Navy 
software,  but  rather  to  use  SmallTalk  as  a  prototyping 
language for the design and initial implementation portion
of our reengineering task and then complete the actual
implementations in C++ and Ada to see how many of the
beneficial object-oriented concepts of SmallTalk-80 could
be  retained  in  the  final  implementations. 


Figure  3.  A  revised  cross  section  of  the  BTA 
showing  commonly  duplicated  code 


* SmallTalk-80 is a registered trademark of ParcPlace
Systems  Inc.,  Sunnyvale,  CA  94086 
4 Redesign

Our reasons for choosing to redevelop the BTA software and to
demonstrate the reengineering efforts in SmallTalk-80 are
threefold:

1. Of all the OOP languages and systems to appear over
the past decade, SmallTalk-80 remains one of the few
'pure' OOP languages that possess all of the
object-oriented concepts [4].

2.  The  class,  object,  and  method  browsers  provided  in 
SmallTalk-80 fulfill our need for a tool that assists in
navigation  through  the  resulting  object-class  library. 

3.  The  SmallTalk-80  system  provides  an  excellent 
environment  for  incremental  development  and  testing 
of  classes  and  methods  as  they  are  developed  without 
the need for development of complex test drivers.

Although  the  initial  redesign  task  is  complete,  the 
subsequent  programming  task  is  still  ongoing  and  is 
therefore  not  discussed  at  great  length  here. 

Some  disadvantages  to  the  use  of  SmallTalk-80  for 
initial  implementation  of  this  type  of  software  include  the 
following: 

1.  Methods  were  not  available  to  read  in  the  large  data 
files  of  integer  and  floating-point  data  necessary  for 
testing  that  were  produced  by  FORTRAN  formatted 
I/O. However, these methods were rapidly constructed
from methods available in the class libraries provided.

2.  Methods  for  handling  common  scientific  data  types 
such  as  two-dimensional  arrays  were  not  directly 
available  and  had  to  be  constructed  from  more 
primitive  methods. 

3.  A  very  steep  learning  curve  is  required  to  use 
SmallTalk-80 and its tool suite.

Our  main  challenge  in  the  redesign  process  was  the 
derivation  of  an  object-class  structure  that  supports  a  high 
degree  of  modularity,  encapsulation,  and  ease  of  maintenance 
for the ALGO component of the BTA software. In deriving
the  object-class  structure,  we  first  identified  the  physical 
objects  being  modeled  by  the  BTA,  and  then  we  determined 
the  fundamental  distinctions  between  these  physical  objects, 
as well as the commonalities among them. This information
was  then  used  in  the  derivation  of  the  class  hierarchy  shown 
in  Figure  4. 


Figure  4.  Inheritance  structure  of  the 
redesigned  ALGO  classes 

4.1 BtaInitModel class


The BtaInitModel class is an abstract superclass for all
ballistic weapon objects modeled in ALGO. Its relationship
with the other classes is based on the premise that for every
ballistic weapon trajectory, it is necessary to initialize the
working variables to their initial values based on the
weapon's release conditions. The BtaInitModel class
contains class-instance variables used for every ballistic
weapon and implements methods for the initializations
necessary at the beginning of each trajectory. These
initializations include

1. Initializations required to account for the effects of
altitude fuzing

2. Initializations required to account for the effects of
time  fuzing 

3.  Initializations  required  to  account  for  the  dynamic 
calculation  of  the  number  of  integration  steps  needed 
in  the  first  phase  of  the  weapon's  trajectory 

4.  Method  used  for  expressing  the  weapon's  drag  as  a 
function  of  its  release  velocity 

4.2 NumericalSolver class (subclass of BtaInitModel)

The NumericalSolver class implements numerical
methods used for each ballistic weapon in the BTA software.
As  currently  implemented,  the  BTA  software  uses  a  second- 
order  Runge-Kutta  numerical  integration  method  for  solving 
ballistic  trajectory  equations.  This  functionality  is  provided 
by  the  methods  implemented  in  the  NumericalSolver 
class.  The  separation  of  the  functionality  provided  by  this 
class from that provided by the BallisticWeapon class is
intended  to  provide  flexibility  for  future  enhancements, 
especially  for  easy  incorporation  of  new  integration  and 
other  numerical  analysis  methods. 
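
To make the role of such a solver concrete, the sketch below integrates a point-mass trajectory with the midpoint form of second-order Runge-Kutta. The paper does not say which second-order variant ALGO uses, so the midpoint form is an assumption, as are the quadratic drag model, the constant drag_k, the level-release initial condition, and the linear interpolation used to locate ground impact.

```python
import math

G = 9.80665  # m/s^2, assumed constant gravity

def derivatives(state, drag_k):
    """state = (x, y, vx, vy); drag_k is a hypothetical drag constant in 1/m."""
    x, y, vx, vy = state
    speed = math.hypot(vx, vy)
    ax = -drag_k * speed * vx          # drag opposes the velocity vector
    ay = -G - drag_k * speed * vy
    return (vx, vy, ax, ay)

def rk2_step(state, dt, drag_k):
    """Advance one step with the midpoint form of second-order Runge-Kutta."""
    k1 = derivatives(state, drag_k)
    mid = tuple(s + 0.5 * dt * k for s, k in zip(state, k1))
    k2 = derivatives(mid, drag_k)
    return tuple(s + dt * k for s, k in zip(state, k2))

def integrate_to_ground(altitude, speed, dt=0.05, drag_k=0.0):
    """Integrate from level release until y <= 0; return (time_of_fall, range)."""
    state, t = (0.0, altitude, speed, 0.0), 0.0
    while True:
        nxt = rk2_step(state, dt, drag_k)
        if nxt[1] <= 0.0:
            # Linear interpolation inside the last step to find ground impact.
            frac = state[1] / (state[1] - nxt[1])
            return t + frac * dt, state[0] + frac * (nxt[0] - state[0])
        state, t = nxt, t + dt
```

Because the integrator is isolated in rk2_step, a different numerical method could be substituted without touching the force model, which is the kind of flexibility the class separation is meant to provide.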

4.3 BallisticWeapon class (subclass of NumericalSolver)

The BallisticWeapon class implements methods for

1. Determining integration step sizes

2. Computing accelerations attributable to gravity,
thrust, and the weapon's drag

3.  Making  final  corrections  in  the  computed  trajectory, 
using  either  the  straightLineMethod  or  the 
trajectoryExtrapolationMethod 

4.  Sequencing  the  various  functionalities 

The BallisticWeapon class implements six different
methods, each of which may be used for determining the
step sizes within each phase:

1. SpecifiedMethod: computes step sizes as a predefined
arithmetic progression.

2. VelocityMethod: computes step size based on the
weapon's velocity at the beginning of the step.

3. PressureMethod: computes non-constant step size
based on the reciprocal of the dynamic pressure being
experienced by the weapon.

4. ComputedMethod: computes the step size as a
constant based on the number of integration steps left in the
trajectory and the total estimated time of fall.

5. TimedMethod: computes step size as a constant based
on the number of integration steps in the phase and the total
time of that phase.

6. RootingMethod: computes step size as a variable
based on the number of integration steps left in the
trajectory and the weapon's altitude.
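
Three of the step-size rules above are simple enough to sketch directly. The paper gives only the verbal descriptions, so the formulas below are the most literal reading of them and should be treated as assumptions; in particular, the elapsed_time argument of computed_step is introduced here and does not appear in the description.

```python
def specified_steps(first_step, increment, count):
    """SpecifiedMethod reading: step sizes form an arithmetic progression."""
    return [first_step + i * increment for i in range(count)]

def timed_step(phase_time, steps_in_phase):
    """TimedMethod reading: a constant step that spans the phase exactly."""
    return phase_time / steps_in_phase

def computed_step(estimated_time_of_fall, elapsed_time, steps_left):
    """ComputedMethod reading: spread the remaining estimated time of fall
    evenly over the integration steps that are left."""
    return (estimated_time_of_fall - elapsed_time) / steps_left
```

For example, a 2-second phase integrated in 8 steps gives a constant step of 0.25 seconds under the timed rule.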

Each  ballistic  weapon's  trajectory  is  divided  into  a 
number  of  phases  depending  on  the  ballistic  weapon  type. 
Associated  with  each  phase  are  methods  implemented  to 
compute 

1.  The  integration  step  sizes 

2.  The  maximum  number  of  integration  steps  for  that 
phase 

The  number  of  phases,  values  of  all  phase  parameters, 
and  the  type  of  each  phase  for  a  particular  weapon  are 
determined  by  analysis  outside  of  the  ballistic  trajectory 
algorithm. 

4.4  Base  classes 

The DrougedBomb, StreamLinedBomb,
ClusterBomb, Flares, Cbu87, UnguidedRocket, and
TimeDependentDragBomb classes form the base classes
for  the  various  weapon  types  in  the  BTA  software.  Each  of 
these  classes  implements  methods  that  depend  strictly  on 
that  particular  weapon  type,  and  which  also  are  necessary  to 
adapt  the  algorithm  to  handle  that  particular  weapon  type. 
One  of  these  methods  is  the  resetParameter  method,  which 
resets  the  parameters  (e.g.,  drag  functions,  step-size 
parameters,  and  number  and  size  of  phases)  that  change  when 
choice  of  weapon  changes. 

The objects shown in circles (Figure 4) are actual
instances of the various weapon classes. For example,
the Mk 83 (bomb with a BSU85B ballute retarding device), the
forward-firing aircraft rocket (FFAR) (2.75-inch), and the B61 (a
bomb with a 17-ft parachute) are instances of the
DrougedBomb, UnguidedRocket, and
TimeDependentDragBomb classes, respectively.
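
The inheritance chain of Figure 4 can be sketched in a few lines. This is an illustration in Python, not the SmallTalk-80 implementation: the class names follow the paper, but the method bodies, the drag_factor and phase_count methods, the snake_case spelling of resetParameter, and every numeric value are hypothetical.

```python
class BtaInitModel:
    """Abstract superclass: holds state common to every ballistic weapon."""
    def __init__(self):
        self.reset_parameter()

    def reset_parameter(self):
        raise NotImplementedError("weapon classes supply their own parameters")

class NumericalSolver(BtaInitModel):
    """Numerical methods shared by all weapons (placeholder here)."""

class BallisticWeapon(NumericalSolver):
    """General algorithm; weapon classes override only what differs."""
    def phase_count(self):
        return len(self.phase_steps)

    def drag_factor(self, time_since_release):
        return self.base_drag

class StreamLinedBomb(BallisticWeapon):
    def reset_parameter(self):
        self.base_drag = 0.02          # hypothetical value
        self.phase_steps = [10]        # one phase, hypothetical step count

class TimeDependentDragBomb(BallisticWeapon):
    def reset_parameter(self):
        self.base_drag = 0.02          # hypothetical value
        self.chute_drag = 0.60         # hypothetical value
        self.chute_delay = 1.5         # seconds, hypothetical
        self.phase_steps = [4, 12]     # two phases, hypothetical

    def drag_factor(self, time_since_release):
        # The override replaces a special-case test in the general algorithm.
        if time_since_release < self.chute_delay:
            return self.base_drag
        return self.chute_drag
```

The general code in BallisticWeapon never tests the weapon type; the weapon-specific behavior is selected by method lookup, and removing a weapon means deleting one class.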
5 Summary and conclusions

Our  findings  relate  to  the  reengineered  BTA,  the 
SmallTalk-80  system  and  its  tool  suite,  and  the 
reengineering  process  itself. 

Regarding  the  reengineered  BTA,  we  found  that  the 
present  difficulty  in  restructuring  the  existing  BTA  software 
to  accommodate  significant  modifications  stems  from  the 
tight  coupling  of  code,  data,  and  methodology  as  evidenced 
in  the  ALGO  subroutine  component.  Much  of  the 
complexity inherent in the BTA code had been added to the
relatively simple BTA as a result of the constant handling
and testing of special cases for individual weapon types. The
tight coupling of code, data, and methodology contributes to
the  relatively  high  (3  man  years  per  year)  maintenance  costs. 

As  we  began  analyzing  the  ALGO  subroutine  code,  one 
consideration  began  to  emerge:  "Would  our  object-based 
approach  lead  to  a  solution  that  is  better  than  that  currently 
provided  by  the  existing  BTA  software  and  at  the  same  time 
fulfill  the  reengineering  expectations?" 

Our results thus far suggest that the object-based
approach offers potential for success. Overall, our
object-based framework resulted in a redesigned ALGO with far less
complexity and with commonly duplicated blocks of code
eliminated by the establishment of reusable methods, thus
resulting in fewer lines of code to maintain.

The  use  of  inheritance  in  our  object-class  structure 
promises  to  provide  considerably  greater  physical 
localization  of  all  special  code  pertaining  to  any  particular 
weapon type. The inheritance feature allowed specific code for
specific weapon types to be physically located within the
weapon class rather than being included as a special case in
the basic algorithm code. The result is that methods in the
weapon  class  are  automatically  used  in  preference  to  the 
general  method  for  the  algorithm,  thus  eliminating  the  need 
for  continual  testing  of  special  cases  in  the  algorithm  code. 
The  use  of  inheritance  allows  the  physical  grouping  of  all 
code  for  a  specific  weapon,  making  future  modifications  or 
even  removal  of  the  code  for  any  weapon  simple  and 
possible  without  the  fear  of  unforeseen  side  effects.  Our 
object-class  structure  for  the  redesigned  ALGO  component 
promotes  the  derivation  of  new  classes  from  working,  tested, 
existing  classes.  The  derived  classes  inherit  existing  methods 
while  extending  and  enhancing  the  base  functionalities,  all 
without  modifying  their  base  classes.  This  extensibility 
reduces  the  effort  required  to  add  new  and/or  modify  the 
existing  functionalities  and,  indeed,  forms  the  core  of  our 
object-based framework.


Regarding  SmallTalk-80  and  its  tool  suite,  our 
experience  indicates  that  the  benefits  of  using  SmallTalk-80 
to  redesign  an  object-oriented  model  of  the  ALGO 
subroutine  component  of  the  BTA  far  outweigh  the  steep 
learning  curve  experienced  in  the  reengineering  process.  A 
necessary  condition  for  the  effective  maintenance  of  a  system 
is  a  level  of  understanding  of  that  system  sufficiently  mature 
to  support  its  maintenance.  It  is  true  that  our  object-based 
approach  may  lead  to  a  reengineered  BTA  software  that 
exhibits  a  high  degree  of  modularity  and  elasticity. 
However,  it  presents  new  challenges  to  the  maintainers.  The 
maintainers  need  to  understand  the  reengineered  system  to 
maintain  it  effectively.  Inheritance  derived  from  our  object 
classes  and  the  dispersal  of  the  redesigned  system's 
functionalities in the form of methods create the need
for tools to assist in navigation through the object-class
structure. We feel that the class, object, and method
browsers provided in SmallTalk-80 fulfill this need.

The  rigid,  object-oriented  discipline  enforced  by  the 
SmallTalk-80  system  resulted  in  a  reverse  engineered  ALGO 
component  that  is  much  simpler  and  promises  to  be  better, 
at  the  least,  in  terms  of  design.  The  availability  of  a  large 
reusable  class  library  of  existing  methods  coupled  with  the 
mature  environment  for  browsing  and  the  incremental 
development  and  testing  of  methods  as  they  are  developed 
was  invaluable. 

Regarding  the  reengineering  process,  one  concern  with 
the  development  of  any  new  approach  to  reengineering 
legacy  code  such  as  that  for  the  BTA  is  how  to  get  the 
current  users  to  accept  the  ideas  and  incorporate  them  into 
their  work.  Our  experience  suggests  that  getting  the  current 
users  involved  in  the  reengineering  effort  is  the  best 
approach,  although  historically  this  has  not  proven 
sufficient  to  overcome  the  inertia  and  pressures  of  the 
moment  encountered  by  the  current  users.  Given  our 
perception  that  the  risks  of  a  reengineered  software  blur  the 
intent  of  the  reengineering  process,  we  suggest  that 
demonstrating  the  reengineering  paradigm  on  (several, 
whenever  possible)  sub-components  of  existing  systems 
allows  current  users  to  become  comfortable  with  the  process 
while considering the end-product. In particular, we contend
that  demonstrating  the  software  reengineering  process  on 
sub-components  of  existing  OFPs  will  serve  to  alleviate  the 
risks,  both  real  and  perceived. 
Abstract

Many have studied the issues of building systems
using the object-oriented paradigm. Others have
worked on how to design and implement hard
real-time systems, i.e., systems which are used to control
and respond to real-time events. Relatively few have
tried to use the object-oriented paradigm for designing
and building complex real-time systems. In this
paper, we investigate the issues of reengineering hard
real-time systems using the object-oriented paradigm.
We propose an object-oriented real-time model which
provides a predictable scheduling framework and flexible
performance real-time objects. With the model we
may guarantee the real-time capability in reengineered
systems and achieve desirable performance.

1  Introduction 

It has been well recognized that real-time system
problems cannot be completely handled by fast
processors alone. Although high speed is desirable in
most real-time systems, high speed by itself cannot
solve all the issues in real-time systems. Real-time
systems in general must utilize their resources in a
more intelligent way, taking into account both
system capacities and job deadlines. In addition,
real-time systems must be flexible so that they can
provide timely responses to dynamic real-world events.
Resiliency in the face of abnormal events is especially
important, since some real-time systems are the sole
controllers of safety-critical applications and failures to
provide timely responses could cause unthinkable
disasters. Therefore, real-time systems must be fast,
predictable, and flexible.

*Supported in part by the Office of Naval Research under
grant N00014-94-1-0034 and by the National Science
Foundation under grant CCR-89-11773.


In this paper, we propose an object-oriented
paradigm for reengineering complex real-time systems.
Using the object-oriented approach allows us to apply
the principles of hierarchical structuring and
component abstraction, which are essential in building any
complex system. In addition, the object-oriented
approach promotes component reusability, which makes
systems easier to maintain and to modify. To meet
the strong requirement of graceful system degradation
in safety-critical real-time systems, we propose to
enhance the adaptability and flexibility of real-time
objects. Finally, our proposal has a sound theoretical
foundation in the scheduling model so that real-time
systems are guaranteed to have predictable
performance.

Our contribution in this work is that we extend
the conventional object-oriented paradigm to include
the considerations for distributed and real-time
applications. Our methodology has special measures
targeted toward system reengineering. This is in contrast
to most object-oriented methodologies, which are more
suitable for building systems from scratch. When
reengineering a complex system, we should reuse
object structures and implementations as much as
possible. For this reason, most conventional object-oriented
analysis and design methodologies fail to provide a
satisfactory reengineering solution. Our methodology, on
the other hand, utilizes theoretically sound real-time
scheduling algorithms as the system design backbone,
so that the system schedulability can be constantly
monitored during the reengineering process. We also
apply the concepts of imprecision [4] and
polymorphism [3] to the performance capability of real-time
objects, so that a real-time object can provide different
performance under different circumstances. We thus
believe that our object-oriented methodology may
provide a better reengineering solution.
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2  Reengineering  Complex  Real-Time 
Systems 

To help systems make intelligent and flexible
decisions, real-time software must be able to express,
to maintain and even to adjust timing constraints.
Conventional system design languages do not specify
timing constraints, as time usually is not a factor in
deciding the system's correctness. In real-time
systems, the notion of a timing constraint is needed to
trigger and to schedule real-time computations.
Expressing timing constraints is thus an important
feature of real-time software. To maintain timing
constraints, scheduling algorithms are used to make
decisions on resource management. By carefully selecting
the right job (e.g. the job with the earliest deadline)
to execute first, a system may satisfy more real-time
constraints. Finally, many external disturbances or
interruptions may cause a system to overload. Whenever
a timing constraint cannot be satisfied, the system must
utilize its built-in flexibility to contain the failure
and to adjust the actions and possibly the timing
constraint itself. In many applications which deal with
control problems, timing constraints can be modified by
taking necessary actions. For example, the deadline for
stopping a car from hitting an obstacle may be
extended by slowing down the car or by switching to a
lower gear.
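
The effect of selecting the job with the earliest deadline first can be
illustrated with a small sketch. It is not taken from any particular
system: the `Job` record, the job names and the time units are
hypothetical, and it assumes a single processor that runs jobs back to
back without preemption.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    duration: int   # execution time needed (hypothetical time units)
    deadline: int   # absolute deadline, measured from time 0


def run_in_order(jobs):
    """Run jobs back to back on one processor; return names that met their deadline."""
    clock = 0
    met = []
    for job in jobs:
        clock += job.duration
        if clock <= job.deadline:
            met.append(job.name)
    return met


def edf_order(jobs):
    """Earliest-deadline-first: sort by deadline (ties keep their input order)."""
    return sorted(jobs, key=lambda j: j.deadline)


jobs = [Job("log", 4, 12), Job("brake", 2, 3), Job("display", 3, 7)]
fifo_met = run_in_order(jobs)             # arrival order
edf_met = run_in_order(edf_order(jobs))   # earliest deadline first
```

With these assumed numbers, arrival order satisfies only one of the
three constraints, while the earliest-deadline order satisfies all
three using exactly the same amount of processor time.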

When reengineering real-time systems, another
important requirement is the predictability of system
performance whenever the target environment is
changed or the system functionality is modified.
Conventional approaches to building real-time systems
attempt to set up systems so that they may operate
acceptably even in the worst-case scenario. To achieve
that, they are often hard-coded or hard-wired for those
special situations. Such real-time systems are very
fragile and also very difficult to reconfigure, both
statically and dynamically, since even a minor change may
disrupt a previously well-tuned execution schedule and
cause some deadlines to be missed.

A better approach to enhancing the flexibility of
real-time software is to change the computation structure
so that the amount of work performed is based on
the amount of time and resources available. In other
words, instead of defining a fixed amount of work to be
performed, we can define a set of workloads which are
candidates for execution. During run-time, a subset
of the workloads is executed using only the amount
of time available. Some of the workloads may have
dependencies between them, when one cannot be
executed until another is finished. This defines the
precedence constraints. Workloads may also be structured
as and/or trees. Each branch of the tree requires a
different amount of time and resources, and also gives
a different degree of reward. The system design and
scheduling issue is then to select the optimal subset of
the workloads which gives the best reward using only
the available time and resources.
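
The selection problem described above can be sketched for a handful of
candidate workloads. The workload names, times, rewards and the
precedence table are illustrative assumptions, and the exhaustive
search is only practical for small sets; it is meant to make the
problem statement concrete, not to suggest a scheduling algorithm.

```python
from itertools import combinations

# Hypothetical candidate workloads: name -> (time needed, reward earned).
WORKLOADS = {
    "filter": (2, 3),
    "track": (3, 5),
    "classify": (4, 6),
    "report": (1, 2),
}
# Precedence constraints: a workload may run only if its prerequisite also runs.
REQUIRES = {"track": "filter", "classify": "track"}


def best_subset(workloads, requires, time_budget):
    """Exhaustively pick the feasible subset with the highest total reward."""
    best, best_reward = (), 0
    names = sorted(workloads)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            chosen = set(subset)
            # requires.get(w, w) is w itself when w has no prerequisite.
            if any(requires.get(w, w) not in chosen for w in chosen):
                continue  # a prerequisite is missing
            time_used = sum(workloads[w][0] for w in chosen)
            reward = sum(workloads[w][1] for w in chosen)
            if time_used <= time_budget and reward > best_reward:
                best, best_reward = subset, reward
    return set(best), best_reward
```

For example, `best_subset(WORKLOADS, REQUIRES, 6)` leaves out
"classify" because its chain of prerequisites would exceed the six
time units available.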

In real-time systems, the approach can be implemented
in several ways. A computation may actively
evaluate its timing constraints to select the execution
path with the most desirable response time. The
run-time system, given global scheduling knowledge, may
bind a real-time request dynamically to a server with
appropriate time and resources available. Finally, a
computation may resort to producing imprecise results
if no feasible alternative exists and some response must
still be generated. In all these different
implementations, the execution time of a computation is
modeled as a first-class object so that it can be
evaluated or modified if necessary. By unifying the
models of time, resource and normal objects, the timing
property of a real-time computation can be better
controlled.
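
As one illustration of producing imprecise results, the following
sketch refines an answer step by step and returns whatever estimate it
has when its budget runs out. The function name and the use of a step
count in place of a real clock are assumptions made to keep the
example deterministic; the input is assumed to be positive.

```python
def imprecise_sqrt(x, max_steps):
    """Newton refinement of sqrt(x) that stops when the step budget runs out."""
    estimate = x if x > 1 else 1.0   # crude but usable initial answer
    for _ in range(max_steps):
        estimate = 0.5 * (estimate + x / estimate)
    return estimate


coarse = imprecise_sqrt(2.0, 1)   # little time available
fine = imprecise_sqrt(2.0, 6)     # more time available
```

Here the step budget plays the role of the execution time treated as a
first-class value: the caller decides how much to grant, and the
quality of the result follows from it.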

3  Object-Oriented  Paradigm 

In an object-oriented system, all entities in the
system are defined as objects. An object may invoke
methods defined in itself or other objects for the
services needed, which in turn may invoke methods
in other objects. Each object is defined by a specific
class. Classes may form a hierarchy where some
classes inherit certain methods from their parent class.
Cardelli and Wegner [2] have defined the three basic
required elements of object orientation as:

object oriented =
    data abstraction + object types + type inheritance

The definition of object orientation has been
extended in [6] for distributed systems:

object oriented =
    encapsulation + abstraction + polymorphism

In the above definition, encapsulation means data
hiding and access control, abstraction means object
grouping according to certain properties, and
polymorphism means the overlapping and intersections of
object functions. Objects form a natural model for
distributed systems since each object has its natural
boundary due to the physical distribution, plus a
well-defined interface and also the message passing
facility.

In engineering or reengineering complex systems, two
issues must be addressed: interconnectivity and
interoperability. A distributed object structure provides
a useful framework for finding solutions to both
issues. Moreover, the object structure naturally
accommodates the heterogeneity and autonomy in most
distributed systems.

For complex real-time systems, the above definition
of object orientation can be extended further to cover
the requirement of predictable performance:

Real-Time object oriented =
    encapsulation + polymorphism + predictable performance

There are two components in performance: the response
time in method execution and the system
schedule for meeting deadlines. Both must be
predictable in an object-oriented real-time system. We
have discussed the predictable response time issue in
the previous section. For the rest of this paper, we
discuss the predictable scheduling issue.

When designing real-time systems, system scheduling
should not be left as the last step in the system
development process. We believe that scheduling
considerations should be integrated into the structures of
real-time objects as well as the whole complex system.
Moreover, they must be considered early in the
system design, when objects are being grouped and
defined. For example, when structuring transportation
vehicles, one can group passenger cars and buses in
the same class since both have more than two wheels.
However, from a performance point of view, it may be
more appropriate to group motorcycles and passenger
cars together, since both may provide a similar kind of
performance in terms of speed and convenience. For
real-time systems, a class hierarchy defined in terms
of performance capability may be more appropriate.

Object-oriented programming facilitates modular
management of resources. Each object provides its
users with certain resources in the form of methods
that constitute the interface of the class. Some
methods may be provided by other objects; these methods
are bound based on the class hierarchy, and sometimes
by the parameters of the invocation. For example,
when a '+' method is requested, the actual
operation executed may depend on the types of the
objects presented as operands and as the result. The
binding process and the service provided may differ
for integer numbers and for real numbers. In
traditional object-oriented systems, the actual realization
chosen depends on the class of the object and
parameters, and the class hierarchy. Inheritance allows
classes to be specified and realized incrementally. Each
subclass represents additional knowledge about the
objects that are instances of the subclass. The binding
of resources to requests can sometimes be performed
statically (by using static typing and complete type
specification), although many object-oriented systems
also allow for dynamic binding.

4  Scheduling  Issues  in  Performance 
Polymorphism 

In real-time systems, the polymorphism concept
can be generalized further to include the execution
performance as one of the binding parameters. This
flexibility is required for a number of reasons when
long-lived systems are being considered. First,
systems may be reconfigured, and the loads on the
systems may change. New versions of a system also may
be developed to enhance system capability. Moreover,
new environmental constraints, e.g. modifications to
a performance specification, may arise from time to
time.

The concept of polymorphism has been used for
different purposes in the object-oriented paradigm.
Substitution based on architectural or performance
criteria is a form of polymorphism that has not been
considered in conventional systems. Instead of having
multiple procedures that perform the same action on
objects of different types, we now have multiple
procedures that perform the same function under
different environmental constraints. This model of
performance polymorphism provides a powerful system
primitive for structuring flexible real-time software.

Formally, let $F$ be a function that must be performed.
For this function, we have several implementations
$F_1, F_2, \ldots, F_k$ that may be chosen. Choosing
one of these implementations makes a resource
commitment. In the packing model for system scheduling,
it chooses a block of particular dimensions, in
resources and time. For the purposes of studying the
dimensions of the block, we need consider only the
vector $(r_1, r_2, \ldots, r_n, t)$ describing its
resource and time requirements. Note that at this point,
the time requirement need not be distinguished from any
other resource requirement; we have reduced the problem
to an arbitrary set of coordinates, one of which
happens to be time. Without loss of generality, we
therefore describe the resource vector henceforth as
$R = (r_1, r_2, \ldots, r_n)$.

Each of the choices $F_i$, therefore, represents a single
point $R_i$ in this configuration space. The availability
of resources is modeled by a set of constraint
inequalities, which will generally be of the form
$0 \le r_j \le M$; such an inequality corresponds to the
assertion, "At most $M$ units of resource $r_j$ may be
used." (Obviously, more complicated constraints are
possible; they can fit easily into the scheme.) If the
point $R_i$ satisfies all the constraints, then $F_i$ may
be bound to the invocation $F$. If not, then $F_i$ is
unacceptable, in that it consumes an excessive amount of
some resource, and we must consider a different
implementation.
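
The binding test described above can be sketched directly: each
candidate implementation carries a requirement vector, and it is
acceptable only if every coordinate stays within its limit. The
implementation names, the three resource dimensions and the numbers
are hypothetical, and choosing the first acceptable candidate is an
assumption made only to keep the sketch short.

```python
def satisfies(requirement, limits):
    """True if every coordinate r_j of the vector lies in 0 <= r_j <= M_j."""
    return all(0 <= r <= m for r, m in zip(requirement, limits))


def bind(implementations, limits):
    """Return the name of the first implementation whose vector fits, else None."""
    for name, requirement in implementations:
        if satisfies(requirement, limits):
            return name
    return None


# Hypothetical vectors (memory in KB, channels, time in ms) for three versions of F.
IMPLEMENTATIONS = [
    ("F1_precise", (512, 2, 40)),
    ("F2_approximate", (256, 1, 15)),
    ("F3_table_lookup", (64, 1, 2)),
]

lightly_loaded = bind(IMPLEMENTATIONS, (1024, 2, 50))
heavily_loaded = bind(IMPLEMENTATIONS, (300, 2, 20))
```

Note that time appears here as just one more coordinate of the vector,
which is the point of the reduction made in the text.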

It may happen that several of the $F_i$'s satisfy all of
the resource constraints. To choose among them, we
assign to each choice a scalar $f_i$ that represents the
figure of merit associated with making that choice.
When jobs are performance polymorphic computations,
jobs do not have a fixed amount of execution
time, but can be executed for a variable amount of
time. Each version defines a reward function which
specifies how much reward can be received for a given
execution time. In some systems, we often want to
allocate resources (especially CPU time) evenly to jobs,
such that all jobs have about the same reward. In
other words, we often want to maximize the minimum
reward for any job in the system. This is known as the
knapsack sharing problem. Brown [1] has proposed an
efficient algorithm that requires O(n^) operations for a
problem with n continuous tradeoff functions, and has
further extended it to handle the cases where the
tradeoff functions are piecewise functions.
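
The max-min objective can be made concrete for the simplest case. The
sketch below is not Brown's algorithm; it assumes every job has a
linear reward function (reward = rate × time, with a positive rate),
for which the allocation that equalizes all rewards has a closed form.
The rates and the time budget are illustrative.

```python
def share_time(rates, total_time):
    """Split total_time so every job earns the same reward.

    rates[i] is the reward job i earns per unit of time (assumed > 0).
    With strictly increasing rewards, equal rewards maximize the minimum.
    """
    level = total_time / sum(1.0 / r for r in rates)   # common reward level
    return level, [level / r for r in rates]


level, allocation = share_time([1.0, 2.0, 4.0], total_time=7.0)
```

Any other split of the same seven time units would take time from some
job and push its reward below the common level, so the minimum reward
could only fall.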

Another possible objective is concerned with the
allocation of resources to maximize the total value
among all reward functions. It is known as the
knapsack problem, which is NP-complete in general.
Heuristic algorithms have been used to solve the
knapsack problem.
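
One common heuristic of this kind, shown here only as an example and
not as the method used in this work, is to take items in order of
reward per unit of time. The item names and numbers are assumptions,
and every cost is assumed to be positive.

```python
def greedy_knapsack(items, capacity):
    """items: list of (name, time_cost, reward). Returns (chosen names, total reward)."""
    chosen, total, remaining = [], 0, capacity
    # Consider the items with the highest reward per unit of time first.
    for name, cost, reward in sorted(items, key=lambda it: it[2] / it[1], reverse=True):
        if cost <= remaining:
            chosen.append(name)
            total += reward
            remaining -= cost
    return chosen, total


ITEMS = [("a", 10, 60), ("b", 20, 100), ("c", 30, 120)]
picked, reward = greedy_knapsack(ITEMS, 50)
```

With a capacity of 50 the heuristic selects "a" and "b" for a reward
of 160, whereas "b" and "c" together would give 220. The heuristic is
fast but carries no guarantee of optimality.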

5  Conclusions 

In this paper, we present an object-oriented
paradigm targeted at real-time systems reengineering.
The paradigm includes flexible performance and
predictable scheduling so that real-time objects can be
reused easily. We are currently working on the system
design and implementation tools for the methodology.

One of the issues in adopting the object-oriented
paradigm for system reengineering is that most
existing systems were built using other models.
Therefore, many believe that it is impossible to provide a
smooth transition into a new paradigm. However,
using our performance polymorphism model, a new
system could coexist with an old one by treating the old
components as another candidate computation that
provides the required functionality but with an alternate
performance. In this way, we believe our model can
easily facilitate system transition, reuse and
integration.


The Navy's Next Generation Computer
Resources Program (NGCR) is now formally
five years old and about seven or eight in
conception. The author was asked to review
the program for strategic direction; this paper is
adapted from that review.

The opinions and recommendations are
those of the author.

The entire paper (less illustrations) is
available for anonymous ftp on
budden.nps.navy.mil in the /pub/Necessary
directory.


NGCR  program 

The program was organized in SPAWAR as a
vehicle to adopt, adapt and otherwise reuse
commercial interface standards for mission
critical applications including C3I and combat
direction systems. NGCR differs from other
standardization efforts within DoD in these
respects:

•  focus is on commercial standards (FDDI,
GOSIP, POSIX, Futurebus+, SQL, PCMT),
which is different from the traditional Mil-Std
approach.

•  standards working groups include
industry participation, some of which is
program-funded. Most Navy labs are involved.
Participation from other services has been
welcome from the beginning.

•  strategy is to target standards as they are
formed within industry rather than to adopt
them after they are finished.

Assessment  Methodology 

A pyramid model was used to judge the NGCR
program. This model was arrived at by
viewing the evolution of tactical decision
support systems as they evolve from
purpose-built stovepipe constructions to a
more general purpose, sustainable form. The
Joint Maritime Command Information System
(JMCIS) evolution is a good example.

Unification  thread 

The unifying thread that runs through this
taxonomy is how each layer treats an atom of
data:

•  how the communications system treats a
packet or message,

•  and passes this chunk of data to the
operating system's file system,

•  how the database management system
organizes the data into tables and relations,

•  how a correlator fuses different atoms of
data into molecules of information,

•  how this information is displayed and
operated on, and

•  how the necessary security attributes
remain attached to the data at each stage in the
process.

Understanding this thread allows us to
cut to the core of both the interoperability and
multi-level security problems and identify
those truly central issues. It also allows us to
relegate several standards issues to 'important,
but secondary' status.


Figure 1. Anatomy of Decision Support
Systems

Figure 1 illustrates our building-block
anatomy¹.

•  Layer 1. All decision support systems
have a means of communications in order to
receive incoming data and to transmit the
decision maker's orders.

•  Layer 2. Computer systems require an
Operating System which schedules tasks for
the central processing unit, organizes and
maintains a file system, and controls
communications input/output.

•  Layer 3. The central function of a
decision support system is to file data in a
database and then extract appropriate views of
the database for the user. In our anatomy
illustration, the layers under the database
management system are concerned with feeding
data into the database. Those layers over the
DataBase Management System (DBMS) are
tasked with preparing the data and presenting a
picture to the decision maker.

•  Layer 4. The Graphical User Interface
(GUI) is the means for making bits into
graphics. When creating a tactical picture in
the mind of a decision maker, graphics are far
more effective at human communications than
the written word, so a powerful GUI is an
essential requirement in nearly all C3I decision
support systems.

The  Correlator  is  the  process  that  fuses,  or 
links,  chunks  of  data  in  the  database  to  each 
other.  This  is  the  essential  process  of 
transforming  data  into  information. 

•  Layer 5. Finally, the Tactical Decision
Aids are those processes that help the decision
maker operate on the information and produce
decisions. These aids allow operators to
visualize 'what if' scenarios, or they extrapolate
existing trends and events into the future.
Tactical decision aids are those which suggest
resources to dispatch to a certain incident based on
knowledge of resources, their capabilities and
positions, probabilities of incidents and other
factors.

If NGCR is doing its standards job
correctly, then there is something to fill in
each block and the set of standards hangs
together as a whole.


¹ Resemblances between this description and the
Common Operating Environment, DODDS
Technical Reference Model and the DISA
Technical Reference Model are somewhat more
than coincidental.


What's  Right 

The  NGCR  Operating  System  Standards 
Working  Group  and  Database  Standards 
Working  Group  appear  to  be  squarely  targeting 
the  respective  layers. 

Additionally,  the  Program  Support 
Environment  Working  Group  should  provide 
useful  help  for  the  Decision  Aids  area. 

In each case, the assessment is on the
relevancy of the NGCR work, not on its
quality, as that seems to be uniformly good
throughout.

The Backplane Working Group suffers for
irrelevancy reasons which become clear only
when we proceed to the second part of the paper
on models.

The SAFENET working group did a fine
job in the local area networking standards but
suffers greatly from lack of scope.

What's  Missing 

Three areas are of particular concern in the
assessment of NGCR's standards adoption,
adaptation and influence effort. Two deal with
the scope of existing NGCR work:

•  inadequate  scope  in  the  networking 
standards  work, 

•  scant  attention  to  the  critical  problem  of 
software  recycling  in  the  correlator  and  decision 
aids  areas  where  COTS  software  products  will 
only  be  of  limited  value. 

The third problem is the observation that
even if the NGCR program places ideal
standards into the Navy inventory, these tools
will be of limited use without some serious
attention to the rest of the information systems
architecture problem. This modularization half
of the architecture problem is the subject of the
second half of this paper.

Correlator  and  Decision  Aid 
shortcomings 

The correlator and tactical decision aids
modules cannot be defined by use of
commercial standards, and implementations are
likely to have high proportions of
government-owned software indefinitely. Because the Navy
will carry the software maintenance burden, it
is vitally important that the Navy modularize
this software properly to gain economies
through software recycling and sharing of
modules amongst programs.

Far more important than the simple
importation of COTS products, this part of the
modularization problem is vital to controlling
software costs and maximizing a sustainable
combat capability that our sailors can actually
use.

The correlator is the module that links
different chunks of data in the database
together, thereby transforming the data into
information. The correlator is dependent on the
sensors and the nature of the data to be fused,
so we need different correlators for different
sets of sensors.


Figure 2. Correlator

The tactical decision aids are highly
dependent on the specific C3I requirements.
Some of those that a destroyer needs may not
be appropriate for a CinC's command center
ashore. And the decision aids needed by a
destroyer engaged in support of a littoral
peacekeeping operation are different from those
needed by a destroyer protecting a convoy. But
the library of tactical decision aids has room
enough for all, and there still will be a lot of
overlap, so we don't necessarily need wholly
different pieces of software. Indeed, a
multi-platform, multi-mission decision support
software product is entirely practical.

The JMCIS Darwinian consolidation
tactic, with a common library of decision aids,
may indeed prove to be a sound approach.

Figure 3. JMCIS program development
path

The commercial information systems
industry does seem to be producing a useful
standard for this problem. It appears that the
Distributed Computing Environment (DCE) as
produced by the Open Software Foundation
(OSF) is highly applicable to enabling the kind
of software interchangeability that the Navy
cannot afford to be without.

Computer language design that allows
separation of module declarations and
implementations is also highly useful. This
feature is common to Modula-2, C, and Ada
amongst other languages, and it makes it
possible to distribute modules across clients
and servers on a network using remote
procedure calls.

Networking  shortcomings 


A somewhat more egregious NGCR
shortcoming is the Navy's failure to come to
grips with its multiple stovepipe
communications link problem. There is no
apparent analog to the JMCIS Figure 3 above
for Navy communications systems:

Figure 4. Convergence (??) of Navy
Communications


This shortcoming is critical: as programs
develop, requirements become better understood
and articulated. And requirements change. And
the need for the information systems of
different systems to communicate with each
other becomes more evident. All these
requirements shortcomings have reasonably
straightforward growth path solutions if the
network is done right. On the other hand, if
the communications are done wrong, this
evolutionary growth path is forever tortuous
and stunted.

A further goad should be the rapid
evolution in radio-based communications from
a situation where DoD owns the connectivity
to one where the military will simply purchase
the services from commercial vendors. The
rapid rise in commercial satellite connectivity
planned for the rest of this decade indicates that
the Navy-owned circuits, and Navy-unique
protocols, will rapidly become unaffordable
relics of the past.

For the communications layer, the Navy's
needs and the commercially used reference
model (the ISO Reference Model) are
highly coincident. The internet approach (a
complex interconnection of networks of
networks, yielding a highly survivable
substrate that shares the scarce bandwidth
resource) is ideal for a Navy.


Figure 5. Internetwork reference model

SAFENET does a fine job within this context,
for local area networks. But SAFENET's
scope is inadequate for terrestrial and
radio-based Wide Area Networks (WANs) and for
wireless LANs. SAFENET's scope and means
also did not adequately cover upper layer
protocols, which provide critical logical
interfaces for the operating system, database
management system and decision aid layers.


Figure 6. Internet Connectivity
Development

Missing is the Navy's influence in Frame
Relay and Asynchronous Transfer Mode (ATM)
standards development. ATM appears destined
to be the protocol of choice in both terrestrial
WANs and radio WANs, and DoD will probably
have to use the protocol as industry provides it.

Missing also is Navy influence in the
IEEE 802.11 committee, which is attempting
to produce standards for wireless LANs which
have great potential in military applications.

And missing is Navy understanding and
influence of the upper layer protocol
development, particularly in the areas of secure
electronic messaging.

Correcting these sins of omission would
provide the basis for completing Figure 4
above.


Modularization: Part the Second


In the first half of this paper, we presented a
'reference model' or a layered architecture
approach to the NGCR standards effort. In this
second part, we address a more concrete
problem, more typical of those that confront
program managers producing information
systems².


Figure  7.  A  Concrete  Model  of  Information 
Systems 

Sense-Decide-Act 

Let's use this Sense-Decide-Act model as our
modularization guide. A typical information
system is made up of end systems that include:

•  sensors or data collection nodes. These
range from radars to human lookouts on a ship,
to humans capturing data and filling in a form.

•  decision support nodes. These nodes are
in the business of transforming data into
information into knowledge into wisdom. The
term 'decision support' is used deliberately,
connoting that the human decision maker
remains an integral part of most systems.

•  action  nodes  which  carry  out  the 
decisions  made  in  the  decision  support 
function. 

•  a  network  that  connects  the  three  kinds  of 
nodes  above  together. 

The  interface  definitions  that  get  codified 
as  standards  are  those  between  the  modules. 


² A purist would point out that the modularization
notion (the reason for interface definitions)
should precede the standards, which are the means
for articulating the interfaces. Indeed, the two
must go together, and DoD, by concentrating on
standards without the accompanying models, has
the chicken without the egg.


Complexity 

In complex information systems, these
models may be either:

•  nested inside each other. A weapon may
be the Act node in a combat direction system.
It may also have an entire Sense-Decide-Act
view within itself, particularly if it has some
autonomy (i.e. fire & forget) requirements.

•  chained together. The output of one
Sense-Decide-Act sequence may be the input of
another. What one information system views
as 'action' may be providing the 'sensory' data
to another.

The  model  also  is  neutral  regarding 
numbers  if  we  do  the  modularization  job 
correctly.  A  decision  support  system  will 
usually  have  many  sensors  feeding  it; 
conversely  a  sensor  may  feed  multiple  decision 
support  nodes. 

Communications  substrate 

These end systems (Sense, Decide and Act) all
are attached to a communications substrate that
carries the output of one node to the input of
the succeeding node. Use of an internetworked
communications substrate again renders the
network neutral regarding numbers: any
number of sensors, decision support nodes and
action entities can be attached to the network,
provided it is sized appropriately.
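
The idea of a substrate that is neutral regarding numbers can be
sketched with a toy publish/subscribe network. Everything here is
hypothetical: the class names, the "report" and "order" topics, the
range threshold rule and the message fields are invented for
illustration and do not correspond to any NGCR interface standard.

```python
from collections import defaultdict


class Network:
    """Toy communications substrate: delivers each message to every subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in list(self._subscribers[topic]):
            handler(message)


class DecisionNode:
    """Receives sensor reports and issues an order when a contact is too close."""

    def __init__(self, network, threshold):
        self.network, self.threshold = network, threshold
        network.subscribe("report", self.on_report)

    def on_report(self, report):
        if report["range_km"] < self.threshold:
            self.network.publish("order", {"action": "alert", "source": report["sensor"]})


network = Network()
orders = []
network.subscribe("order", orders.append)      # stands in for an action node
DecisionNode(network, threshold=10)
DecisionNode(network, threshold=50)            # second decision node, same sensors
for sensor, range_km in [("radar", 40), ("lookout", 5)]:   # two sensors
    network.publish("report", {"sensor": sensor, "range_km": range_km})
```

Neither sensor knows how many decision nodes are listening, and
neither decision node knows how many sensors feed it; adding another
of either kind requires no change to the existing nodes.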


Figure 8. Network substrate

One of the advantages of current
internetwork technology is that this capacity
problem is fairly easily assuaged: it is
usually quite practical to increase the bandwidth
of the network arbitrarily and incrementally,
without requiring a system rebuild and
transparently to the end systems attached to
it^3.
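A small sketch may help make "neutral regarding numbers" concrete. It assumes a publish/subscribe style of attachment, which the paper does not prescribe; the `Substrate` class and the topic names are illustrative only.

```python
from collections import defaultdict


class Substrate:
    """Content-neutral carrier: routes by topic and never inspects payloads."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def attach(self, topic, handler):
        # Any number of end systems may attach; none needs to know the others.
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Deliver to every attached handler and report how many received it.
        handlers = list(self._subscribers[topic])
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

Adding a second decision support node is one more `attach` call; neither the sensors nor the substrate change.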

Abstract  to  concrete 


The following chart is offered as a proposed
allocation of function amongst the different
modules.

Figures. Function Mapping

The next few sections amplify the chart.


Sensor  Module 

GUI and decision aids. For
integration purposes, a sensor need not contain
any user interfaces and hence no GUI. In a
practical sense, however, most sensors have
some sort of command/display unit allowing
autonomous local operation; there is no
objection to this. Local operation often means
that decision aids also reside in the sensor;
again, there is no objection, but the issue is
irrelevant to the larger systems problem.

Correlation.  The  sensor  should  contain 
a  complete  data  reduction  capability  so  that  an 
updated  sensor  can  be  added  to  the  system 
without  also  requiring  upgrades  within  the 
decision  support  system. 

Conversely, the multiple sensor
integration functions should reside wholly
within the decision support system so that
improved correlation algorithms can be inserted
in new software releases without requiring
sensor changes. Avoid multiple-sensor
integration functions within the sensors
themselves.


^3 The appropriate truism here is that amateurs talk
about capacity while professionals discuss
availability.

Data management. The critical item is
that the sensor must provide data in a format
suitable for direct insertion in the decision
support system's database. This is a data
standardization problem.
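As a hedged illustration of "direct insertion", the sketch below assumes the agreed data elements are captured in one shared record definition. The field names, the table layout and the use of SQLite are assumptions made for the example, not part of the paper.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class TrackReport:
    # Hypothetical data elements; a real set would come from a data standard.
    sensor_id: str
    time_s: float
    bearing_deg: float
    range_m: float


SCHEMA = ("CREATE TABLE track "
          "(sensor_id TEXT, time_s REAL, bearing_deg REAL, range_m REAL)")


def insert_reports(db, reports):
    """Decision support side: rows go straight in, with no per-sensor translation."""
    db.executemany("INSERT INTO track VALUES (?, ?, ?, ?)",
                   [astuple(r) for r in reports])
    return db.execute("SELECT COUNT(*) FROM track").fetchone()[0]
```

Because every sensor emits the same record, a new sensor type adds rows but no new translation code on the decision support side.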

Most sensors will contain some kind of
operating system (and perhaps a DBMS).

Many sensors lack the complexity needed for a
general purpose operating system like Unix.

In most cases, the OS used is up to the
supplying vendor. Since it is wholly
contained within the sensor module and the
supported code is the vendor's responsibility,
not the government's, this is quite
acceptable.

Communications. Each end system,
including sensors and the DBMSs within the
decision support system, should contain a
messaging interface. The ubiquitous choice
should be an electronic mail one. If additional
options are needed for special purposes, they
can be added, but any additions should be done
without compromising the common
denominator.

End systems should generally interface
only to a LAN and leave wide area
communications (both terrestrial and radio-based)
as a function wholly encapsulated within
the network module.

End systems, especially sensors, need a
management interface. Management
information can be categorized into:

•  get-info  (querying  a  sensor), 

•  set-info  (controlling  the  sensor  and  its 
configuration), 

•  traps  (the  sensor  initiating  messages 
about  itself). 
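The three categories above can be sketched as a toy managed end system. The class, attribute names and trap format are hypothetical; they echo the categories in the text rather than the API of any particular network management protocol.

```python
class ManagedSensor:
    """Toy end system exposing get-info, set-info and traps."""

    def __init__(self, trap_sink):
        self._info = {"status": "ok", "scan_rate_hz": 1.0}
        self._trap_sink = trap_sink  # callable that receives unsolicited messages

    def get_info(self, name):
        # get-info: the manager queries the sensor.
        return self._info[name]

    def set_info(self, name, value):
        # set-info: the manager controls the sensor and its configuration.
        if name not in self._info:
            raise KeyError(name)
        self._info[name] = value

    def report_fault(self, description):
        # trap: the sensor initiates a message about itself.
        self._info["status"] = "fault"
        self._trap_sink({"trap": "fault", "detail": description})
```

A manager would pass in its own handler as `trap_sink`; in the sketch a list's `append` method is enough to collect traps.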

Given the state of network management
technology, it makes the most program
development and logistics sense to extend the
network management information bases to
include the end systems, particularly as the
network management protocols have been
designed to allow this^.


^If the communications interface is built as
described, then the backplane within any sensor
is irrelevant. The backplane is of concern for
logistics and sensor upgrade, but not for
interoperability reasons. For this reason, we
give the NGCR Backplane Working Group
high marks for execution, but low ones for
relevancy.

Action entities

Those instruments effecting actions upon the
environment, whether weapons or some other
actuator, need a defined interface with the
decision support system, but it can be
somewhat sparser.

All of the functions above the network are
either vacuous or wholly contained within the
action entity and therefore not germane to our
interoperability problem, with the exception of
the data elements, which must be standardized.

The  network  interface  should  again  be  a 
LAN  connection,  a  messaging  interface  and  a 
management  information  base. 

Decision  support  systems 

The decision support system is likely to be the
most complex module and is likely to contain
most of the functions. Decision aids, a GUI,
the DBMS, and a complex operating system
will all be found here.

Correlator. The partitioning of the data
fusion function as noted in the sensor
paragraph should persist. This is a key
division in maintaining an open system: it is
one easily transgressed and hard to fix
retroactively. But if the data reduction function
remains in the sensor and the data fusion
function stays wholly in the decision support
system, both can evolve independently and
incrementally.
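The partition can be shown with a deliberately simple sketch. The threshold rule and the averaging "fusion" are placeholders chosen for brevity (a plain mean ignores bearing wrap-around, for instance); only the division of labor is the point.

```python
from statistics import mean


def reduce_in_sensor(raw_returns, threshold=0.5):
    """Sensor side: turn raw (bearing, strength) returns into one reduced report."""
    hits = [bearing for bearing, strength in raw_returns if strength >= threshold]
    return {"bearing_deg": mean(hits)} if hits else None


def fuse_in_decision_support(reports):
    """Decision support side: combine reduced reports from several sensors."""
    bearings = [r["bearing_deg"] for r in reports if r is not None]
    return mean(bearings) if bearings else None
```

Either function can be replaced by a better algorithm without touching the other, as long as the reduced report format between them stays fixed.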

Network interface. The same
modularization guidance applies: a messaging interface,
a LAN physical interface and a management
information base should all be part of the
network interface.

Networking substrate

The networking structure that connects all the
above end systems should contain the local area
and wide area networking segments required, a
management information base to maintain
control of the network, and robust messaging
support.

The network should not have any
requirements in any of the other layers as far as
the specific information system is concerned
(these elements may well appear internally to
the network as part of the network management
function). The network function should be
confined to transporting data and the network
should not be linked to the content of any
particular data. In other words, we use the
network to transport data and not this network
to transport track data, that network to
transport intelligence data and the other
network to transport EW data.

This modularization allows:

• incremental additions to the network to
extend connectivity, increase capacity or
improve availability (robustness) of the
network without affecting any of the attached
end systems,

• the end systems to evolve
independent of network developments.

Installed base

Updates to the installed base are often
implemented with little thought to gaining or
maintaining the modularity described^. One of
the most persistent sins is to violate the
correlation partitions: rather than update a
sensor and equip it with improved data
reduction processing horsepower, the path of
least resistance may be to try to sandwich that
improved data reduction into the next release of
the decision support software.

Over the long term, such expediencies cost
more than they save. This is especially true
when a sensor is recycled and feeds data to
multiple decision support systems, a
common case with specialized military
equipment.

A common understanding of how Navy
information systems are modularized will help
multiple, often rather independent, component
managers produce and maintain components
that can be assembled into usable systems.

Recommendation: Every time a radar
or other device is field changed and every time a
piece of decision support software is upgraded
to a new release, this incremental improvement
should include progress toward the partitioning
of functions as outlined here.


These  things  we  must  fix 

The  Navy  requires  an  authority  with 
architectural  responsibility.  Included: 

• defining between-program interfaces (e.g.
between Aegis and NTCS-A),

• defining the interfaces between the
Copernicus Pillars (which will place a clear
modular separation between decision support
systems and an underlying network substrate),

• focus on a 'Top Half Dozen' programs.


^The software industry is well aware of the effects
of the Second Law of Thermodynamics; but it
happens to hardware and to modularization too.

This focus on architecture should be
distinct from the programmatic and budget
bureaucracy because it all too easily gets
subordinated to that exercise. This is
particularly true when the architect is
physically located in the same command as
program managers, such as a SysCom.
Additionally, locating this authority within a
particular SysCom is insufficient, as
information systems are produced by all of
them.

The scope must extend to at least the
combat direction system, C3I system, and
management information system development
communities. These unions tend to view
themselves as isolated from the others but:

• each is dealing with the same
information system architecture problems,

• information systems created and evolved
by each community must interoperate with
those of the others.

Authority 

The  authority  must  be  seen  as  one  improving 
the  Navy's  information  systems  capability  (and 
thereby,  its  combat  capability)  and  not  as 
another  stumbling  block  on  a  program 
manager's  critical  path: 

• The authority must manage an
education program, particularly of program
personnel, in open systems, layered
architectures, and appropriate modularization of
systems. This education should focus on
problem analysis (education) rather than
solution building (too often training). The
education should be career-oriented, not a
one-time deal.

Note that I'm using the term education as
distinct from training^. A training approach
would be to say 'here's a standard, go use it.'

An educational approach would provide the
background to outline requirements and then
select an appropriate modularization and set of
standards to define the module boundaries.

The target audience must include senior
uniformed and civilian personnel on the N-staff
and in the SysComs. It must also include, in
greater depth, mid-grade personnel in the
SysComs and engineering personnel at the
Navy's labs^. And if the Navy intends to be
serious about cooperation with industry, this
educational opportunity must be made available
to contractor and potential contractor personnel.

The education approach should be applied
throughout an officer or civil servant's career.

The executive agent must have
management influence over an education budget
and educational infrastructure to carry out this
duty.

•  Technical  intelligence.  The  Navy 
too  often  reinvents  things  available  COTS 
because  it  is  unaware  that  solutions  already 
exist  in  the  commercial  sector.  The  Navy 
needs  a  KGB  —  in  the  Andropov  incarnation 
—  to  keep  both  it,  and  the  program  managers 
it  services,  current  in  our  galloping 
information  technology. 

Such an intelligence system need not be
clandestine or in any way covert. Open sources
will do nicely.

Simple intelligence gathering is
insufficient; an ability to evaluate, fuse, and
effectively disseminate finished intelligence in
a manner responsive to the customer's needs is
vital. The intelligence community working in
the national security arena has excellent tools
and methodology, along with a good
understanding of what finished, evaluated
intelligence means. We need to apply that
methodology here to our own national
technology base.

• Program planning. Many of the
architectural sins in Navy programs are
committed before the Tentative Operational
Requirement is ever signed out of the
Pentagon. Many more of them are committed
by the time the Navy's labs get their tasking.
As in software engineering, errors committed
the earliest are the hardest and most expensive
to correct. Therefore, the architectural
authority must be privy to program planning at
its earliest stages: that is the most effective
and least painful place to outline an open
systems pattern to build the system to.

The authority should apply the limited
resources at his disposal to a half dozen
programs that we judge as key to the Navy.
This Schwerpunkt approach says that if these
half dozen programs are all delivered as open,
evolvable systems based on the same model,
then the rest of the Navy's information systems
will come along of their own accord. (This
focused approach does not, of course, prohibit
on-request help to other programs.)


^As a data point, Naval Postgraduate School
provides a one-time shot (no continuing
education) that is limited almost entirely to
uniformed naval officers at the Lt/Lcdr level.


^Webster: education is a systematic study of
problems, methods, and theories .... Training is
instruction to make proficient or qualified.

An  informal  poll  suggests  the  following 
as  a  Top  Half  Dozen: 

• AEGIS Combat System Mk7: fleet
and littoral area air defense

• ACDS Model 5 and NTU/ACDS Model
4 block 0: combat direction systems

• CSSA:2P/Link 16: the radio-WAN

• BSY-1/BSY-2: submarine combat
direction system

• JOTS/GOTS/NTCS-A/JMCIS:
decision support system

• OSS: decision support system

Conclusion 

Admiral Nelson and the Royal Navy achieved
mastery of the seas when Nelson threw the
doctrinaire Fighting Instructions overboard and
replaced them with a shared corporate ethic and
understanding of how to defeat the enemy.
Open systems cannot be achieved by a
directive-oriented, standards-only approach. We
must build, through an education process,
a shared understanding of how program
components are modularized if we wish to
achieve a building-block open systems
architecture.

Please note that we are dealing with a
dramatically changed environment from twenty
years ago in information systems: the bulk
of the technology is dual use. If the US Navy
doesn't do this first, the same technology is
available to other navies.

Recommendation 1: organizational
commitment. Someone in the Navy needs to
take charge of the Navy's information system
architecture. A simple program advocating
standards is not enough.

Recommendation 2: education of
our people. The Navy needs a strong
educational program that teaches program
management personnel how to analyze
and modularize information systems problems.

Recommendation 3: the Navy needs
to adopt a modularization scheme common to
all information systems expressly to foster the
production of recyclable building blocks.


Reuse-based Reengineering:
Notes From the Underground


Frank  Svoboda 

Unisys  Government  Systems  Group 
Reston,  Va. 


Abstract 

This paper describes the reengineering process and
lessons learned to date for the Army/STARS/Unisys
Demonstration Project (the Demo Project) Application
Engineering Team (AET). The Demo Project combines
elements of reengineering and reuse through simultaneous
Domain Engineering (domain modeling and asset^1
creation) and Application Engineering (reengineering,
maintenance, and new development).

Although traditionally assuming the passive role of
consumer for Domain Engineering (DE) products, Application
Engineering (AE) can actively support reuse (reuse-based
reengineering). The concurrent nature of the Demo
Project's DE/AE work affords unique possibilities for
sharing of information and modeling techniques that might
not otherwise occur when the two activities are distributed
across space and time. The AET process defined in this
paper addresses issues related to systematized reuse during
reengineering and continuing software maintenance.

Introduction 

Megaprogramming is the STARS vision of process-driven,
domain-specific, reuse-based, technology-supported
systems development. The purpose of the Army/STARS/Unisys
Demo Project is to show the benefits of megaprogramming
in an Intelligence/Electronic Warfare (IEW) environment.
To this end, the Demo Project enacts simultaneous
Domain Engineering and Application Engineering
cycles to create and utilize software assets for reengineering
and subsequent maintenance phases. A fundamental
difference between these modes of operation is that Domain
Engineering identifies applicable systems for a given
domain, while Application Engineering identifies applicable
domains for a given system.


^1 An artifact is data existing on physical media that
conveys information about a software system. A work-product
is a planned artifact. An asset is an artifact of potential value
within a given domain (i.e., an area of related knowledge or
activity). Assets have either been developed for reuse,
reengineered for reuse, or certified for reuse as-is.

Figure 1 depicts the traditional roles of Domain
Engineering and Application Engineering. This model
implies that Domain Engineering and Application
Engineering are tightly coupled, executing in a strictly
sequential manner.


Figure 1: Traditional Domain/Application
Engineering Roles


The Domain Engineering Team (DET) selected an IEW
sub-domain, Emitter Location Processing and Analysis
(ELPA), as its domain of focus (DOF). The DOF provides
the scope for domain modeling and asset creation.
The AET, in turn, uses assets developed by the DET for
reengineering within the DOF. Figure 2 depicts the Demo
Project's model of concurrent Domain Engineering and
Application Engineering. Here, AE can operate independently
of DE in non-DOF areas, coordinating at asset
hand-off.


Figure 2: Concurrent Domain/Application
Engineering


Additionally, the AET has chosen a broader
reengineering scope that encompasses interfacing
sub-systems and non-contiguous application areas as well.
Within this broader scope, the AET is reengineering not
only for near-term maintainability, but also to capture
tradeoffs, decisions, and underlying rationales. Providing
this information about non-DOF areas to future Domain
Engineering completes the DE/AE cycle. It is envisioned
that domain engineering and maintenance will eventually
be integrated, so that the asset base will continue to evolve
through successive maintenance iterations.


Application  Engineering  Goals 


The AET's prime objective is the reengineering of a
target IEW system, Improved Guardrail V (IGRV),
to improve maintainability, flexibility, and evolvability (the
ability of a system to suffer modification). Incorporating
the STARS megaprogramming vision into this effort expands
these objectives to include the following goals that
support systematic reuse for subsequent maintenance
cycles:


a. the definition and enactment of a repeatable S/W
maintenance process supported by an integrated
software engineering environment (SEE)

b. the utilization and assessment of software assets

c. sharing of information and modeling techniques
with the Domain Engineering Team


At the time this paper was drafted, the AET had
completed Reverse Engineering of a selected non-DOF
IGRV subsystem. The sequence of activities for this
Reengineering phase and the conceptual underpinnings of the
AET Reengineering process are described below.

AET  Reengineering  Process 

The components of the AET Reengineering process are
Reverse Engineering, Improvement, Restructuring, and
Forward Engineering [1]. We distinguish maintenance as
"modification of a system's functionality" from
reengineering as "modification of a system's form." In
full-cycle reengineering, requirements serve as the linchpin
between a legacy system and its reengineered counterpart.
In this case, the first significant issues addressed during
Forward Engineering are those relating to architecture.
The identification and adoption of standard architectures
are important components of Reengineering between the
life-cycle phases of Requirements and Design.


Figure  3:  Reengineering  Process  Framework 


Figure 3 (adapted from Byrne [2] and Chikofsky/Cross
[3]) depicts the continuum of activities that comprise the
AET's Reengineering process. As indicated, abstraction
is the underlying mechanism for Reverse Engineering,
while refinement serves the same purpose for Forward
Engineering. The Reverse Engineering and Forward
Engineering phases are "mirror images" of each other,
progressing backwards and forward, respectively, through the
life-cycle steps of Requirements, Architecture, Design, and
Implementation. The AET Reengineering approach
incorporates improvement and restructuring within the
body of the Forward Engineering phase. Separate
activities for Reverse Engineering are given in the
following text.

Reverse  Engineering  Process 

The AET Reverse Engineering process consists of
activities that represent IGRV artifacts at increasing levels
of abstraction. This process requires techniques that
progress backwards conceptually from relatively concrete
code artifacts to more abstract requirements artifacts.

The goal of Reverse Engineering is to model and understand
an existing (as-is) system. Since Reverse Engineering
is also part of Domain Engineering, both the DET
and the AET can exploit this similarity through similar
representations, tools, and processes. The scope of
Reverse Engineering will vary between teams, however;
Application Engineering models a single system, while
Domain Engineering models commonality and variation
across multiple systems in a given domain.

The first step in defining our reengineering process
consisted of defining a procedural model that applies to each
life-cycle phase. The Reverse Engineering Process
Template below describes this model, which incorporates
Byrne's Reverse Engineering Procedure [4] and the
Plan-Enact-Learn Cycle of the CFRP [5]. The Reverse
Engineering Process Scenario is an expansion of the template
across the entire life-cycle. The Demo Project used this
scenario to specify tool use and team interaction at given
process steps.

Reverse  Engineering  Process  Template 

For each life-cycle step backwards from code through
requirements, the following steps are performed:

a. Collect, organize, and understand system artifacts.
"Hard" artifacts such as code and supporting
documentation are gathered and organized to ease
retrieval. System knowledge and informal feature
descriptions are captured as artifacts.

b. Analyze and model system artifacts. Determine
grouping criteria. Interpret artifacts and group into
higher-level abstractions. Artifact models, when
recorded, become the artifact input for the next
life-cycle step in the reverse engineering process
(e.g., Structure Charts, as code models, typically
represent the functional decomposition design).

c. Evaluate the artifact models. Determine evaluation
criteria and evaluate the artifact models against
these. For example, if Structure Charts derived
during code modeling do not convey any significant
information, then it may be desirable to eliminate
selected portions from the overall model.

d. Consolidate the artifact models. Reconcile the
artifact models with any existing artifacts at the
same level. For example, after creating Structure
Charts, examine any legacy design artifacts and
combine these to produce a complete, accurate, and
understandable set.
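The template can be read as a loop over adjacent life-cycle levels. The sketch below is one possible reading, assuming artifacts are simple values and that analysis and evaluation are supplied as functions; the names `LEVELS`, `reverse_engineer`, `analyze` and `evaluate` are hypothetical and are not taken from the Demo Project's tooling.

```python
LEVELS = ["code", "design", "architecture", "requirements"]


def reverse_engineer(legacy, analyze, evaluate):
    """Walk backwards from code to requirements, applying steps a through d."""
    results = {}
    inputs = list(legacy.get("code", []))                  # a. collect
    for lower, upper in zip(LEVELS, LEVELS[1:]):
        models = analyze(lower, inputs)                    # b. analyze and model
        kept = [m for m in models if evaluate(upper, m)]   # c. evaluate
        existing = [a for a in legacy.get(upper, []) if a not in kept]
        merged = kept + existing                           # d. consolidate
        results[upper] = merged
        inputs = merged     # recorded models become input for the next step
    return results
```

Passing `analyze` and `evaluate` in as parameters mirrors the template's "determine grouping criteria" and "determine evaluation criteria": the loop is fixed while the criteria vary by project.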

Reverse  Engineering  Process  Scenario: 

I. Preparation

A. Perform line-of-business analysis. Define and scope
reengineering effort. Set reengineering context and
objectives.

B. Collect and catalog legacy system artifacts. This
includes baselining code, design, architecture, and
requirements models, trouble reports, test cases and
procedures, and supporting documentation.

C. Interview domain experts and record system
knowledge.

D. Interview users and record informal features of the
system.

II. Code -> Design

A. Collect code artifacts and organize according to
reverse engineering needs (e.g., map task/function
to directories/subdirectories). Derive and capture
understanding of the artifacts.

B. Identify criteria for design partitioning and group
code modules as design units. Record module
invocation and data coupling among code modules
(e.g., via Structure Charts, cross-reference, etc.).

C. Create representation of internal module processing
and data access using textual description, tables,
and/or graphic notation.

D. Update code documentation, including headers,
embedded commentary, and other supporting
documents.

E. Generate Data Dictionary. Perform balancing and
consistency checks on data dictionary.

F. Determine evaluation criteria for design artifacts.

G. Evaluate design artifacts to determine usability,
accuracy, and coverage. Hold internal reviews and
walk-throughs. Record results and recommended
actions for rework.

H. Consolidate design work-products with legacy
design artifacts.
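As an illustration of recording module invocation (step B of the Code -> Design phase), the sketch below derives a simple cross-reference from source text. It uses Python's standard `ast` module on Python source purely as a stand-in; the paper does not say which tools were used on the IGRV code, and the function name `call_graph` is hypothetical.

```python
import ast


def call_graph(source):
    """Map each top-level function to the sorted names it calls directly."""
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            calls = {
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
            graph[node.name] = sorted(calls)
    return graph
```

The resulting mapping is the raw material for a Structure Chart: each key is a module and each value lists the modules it invokes.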

III. Design -> Architecture

A. Collect design artifacts and organize according to
user needs (e.g., map task/function to
directories/subdirectories). Derive and capture understanding
of the artifacts.

B. Identify criteria for architectural partitioning and
group code modules as architectural units that
describe functionality, connections, and interface
mechanisms. Create representation of architectural
units using textual description, tables, and/or
graphic notation.

C. Update design documentation.

D. Update data dictionary. Perform balancing and
consistency checks on data dictionary.

E. Determine evaluation criteria for architecture
artifacts.

F. Evaluate architecture artifacts. Hold internal
reviews and walk-throughs. Record results and
recommended actions for rework.

G. Consolidate architecture work-products with legacy
architecture artifacts.

IV. Architecture -> Requirements

A. Collect architecture and test artifacts and organize
according to user needs. Assess informal features
(implicit requirements) through user interviews.
Derive and capture understanding of the artifacts.
(Note: since legacy test artifacts [e.g., plans, procedures,
cases, results] were used to validate/verify
the original system, these artifacts may provide
insight into system requirements.)

B. Identify criteria for requirements model partitioning
and group architectural units into data transformations
or control transformations (data/control flow
diagrams). Identify external data sources/sinks.

C. Identify state/mode actions, events, and triggering
mechanisms (behavioral modeling). Depict these
via textual description, tables, or graphic notation
(e.g., State Transition Diagrams, Statecharts, Event
Diagrams, etc.).

D. Depict data organization via textual description,
tables, or graphic notation (Data Structure
Diagrams, Entity-Relationship Diagrams).

E. Update architecture documentation.

F. Update data dictionary. Perform balancing and
consistency checks on data dictionary.

G. Determine evaluation criteria for requirements
model artifacts.

H. Evaluate requirements model artifacts. Hold
internal reviews and walk-throughs. Record results
and recommended actions for rework.

I. Consolidate requirements model work-products
with legacy artifacts, recorded domain knowledge,
and informal features.

J. Update legacy requirements documentation.

Lessons  Learned 

The following items informally describe some lessons
learned during the enactment of the Reverse Engineering
process.

• Choose common terms (domain lexicon) and agree on
definitions (glossary). When terms are adopted that
take on a specific meaning, it is of critical importance
to communicate the meaning of those terms to all
stakeholders. This step should be performed early and
often. A set of canonical domain terms should be
defined and shared. Continuing efforts should feed
back information into the lexicon.

• There is no substitute for domain/system expertise. The
Demo Project was blessed with readily available
domain/system experts and information. In assessing
artifact quality, in modeling system functionality, and
in identifying implicit requirements, expert help greatly
enhanced productivity.

As with typical legacy systems, additional IGRV
information sources included code and requirements
documentation only. Current executable code and its
corresponding source code as well as accurate code
documentation (headers, embedded commentary) were
still intact. Neither design nor architecture artifacts
existed. Requirements artifacts included the system
requirements specification (of questionable accuracy)
and the system operator's manual.

• Closer interaction between the DET and AET was
anticipated than what occurred. Due to different goals (the
DET needed to model commonality and variation of
multiple systems within the same domain; the AET
needed to model multiple subsystems within a single
application), different tools were chosen and interaction
diverged. Certain efforts have benefited from
cross-fertilization. Design modeling of some low-level
units was shared between the DET and AET. The DET
architectural modeling group performed reverse
engineering using representations similar to those used
during AET Reverse Engineering.

Use of similar tools and representations eases
sharing of information across Domain Engineering and
Application Engineering teams; different tools inhibit
sharing. Bridge utilities (i.e., those that enable
information sharing between tools) can help narrow the gap
between tools, if the underlying representations are
congruent to each other.

•  Some  typical  reengineering  steps  did  not  prove  as 
effective  as  originally  thought.  Structure  Charts  were 
not  as  useful  to  architectural  grouping  as  originally 
thought.  Architectural  grouping  criteria  were  extracted 
from requirements and supporting documentation and
confirmed with domain experts. For a better designed
system, the Structure Chart → Architectural unit
progression may have been easier. The AET may use
Structure  Charts  created  during  Reverse  Engineering  to 
identify  and  reduce  module  complexity  during  Forward 
Engineering. 

• The AET did not choose to perform complexity
analysis on legacy code. Although this analysis could
identify maintenance problem areas, the AET's
decisions  on  which  areas  to  reengineer  were  based 
more  on  user  needs  and  economic  factors. 

• By looking ahead at requirements as a goal, the AET
used legacy requirements as heuristics for determining
architectural and requirements grouping criteria. This
technique  enhanced  productivity,  but  tended  to  inhibit 
traceability  between  life-cycle  work-products. 

•  Dealing  with  scoping  and  boundary  issues  is  an 
important  part  of  Reengineering  context  setting. 
Although intuition suggests that coarse-grained
architectural improvements may provide the best
long-term return on investment, available resources may
dictate an incremental approach. The scope of Reverse
Engineering may differ from the scope of Forward
Engineering.

•  The  following  were  identified  as  sources  of 
requirements: 

a. "Hard" requirements from legacy documentation

b.  Reverse-engineered  requirements 

c.  Informal  features  -  implicit  requirements  from  a 
user's  point-of-view;  at  a  minimum,  a  reengineered 
system  must  provide  the  capabilities  of  its 
predecessor. 
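
The lesson above on bridge utilities can be made concrete with a small sketch. It assumes two hypothetical record layouts, one for a Domain Engineering tool and one for an Application Engineering tool; neither layout, nor any of the names used, is taken from the tools used on the Demo Project. The point illustrated is only that a bridge is straightforward when the two representations are congruent, which can be checked with a round trip.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DomainUnit:
    """Unit as stored by a hypothetical Domain Engineering tool."""
    name: str
    parent: Optional[str]
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


@dataclass(frozen=True)
class AppModule:
    """Module as stored by a hypothetical Application Engineering tool."""
    module_id: str
    enclosing: Optional[str]
    data_in: Tuple[str, ...]
    data_out: Tuple[str, ...]


def domain_to_app(unit: DomainUnit) -> AppModule:
    # Field-by-field mapping; possible only because both records
    # carry the same information under different names.
    return AppModule(unit.name, unit.parent, unit.inputs, unit.outputs)


def app_to_domain(module: AppModule) -> DomainUnit:
    # Inverse mapping, so information can flow in both directions.
    return DomainUnit(module.module_id, module.enclosing,
                      module.data_in, module.data_out)


def is_congruent(unit: DomainUnit) -> bool:
    # Round-trip check: nothing is lost crossing the bridge and back.
    return app_to_domain(domain_to_app(unit)) == unit
```

If one tool recorded information the other could not hold (for example, variation points across systems), the round trip would lose it, and the bridge would narrow the gap only partially.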
Conclusion

For  any  engineering  effort,  reuse  and  reusability  should  be 
embraced  as  first  principles.  New  development  should 
occur  only  as  a  last  resort.  Opportunities  to  share 
knowledge  should  be  planned  for  and  explored  at  every 
possible step. As Domain Engineering becomes more
accepted, tools, processes, and training will evolve to
support coordinated DE/AE efforts. At a human support
level, our experience has shown the value of commonality
in language, concepts, and representation. Common
understanding is at the core of reusability. Through closer
interaction and more effective communication among
system/domain stakeholders, we can evolve our
engineering practices and improve the quality and
adaptability of the software-intensive systems we develop.

Abstract: This paper discusses a plan that
addresses how the Software Engineering Institute
(SEI) may assist the Department of Defense (DoD) in
reengineering its large software-intensive systems.
This plan is based on a view of reengineering as an
engineering problem to improve the cost-effective
evolution of large software-intensive systems. This
view of reengineering, which takes the whole
software engineering process into account, fosters a
growth path by leveraging promising emerging
software engineering technologies. The paper also
highlights the results of an October 1993 workshop
conducted by the SEI, and discusses how the
workshop themes relate to the issues discussed in the
paper.


1 Introduction

In the last few years, the world has realized that the
number of large systems being built from scratch is
rapidly diminishing while the number of legacy
systems in use is very high. New system capabilities are
created by combining existing systems. At the same
time, the context in which these systems have been
built has changed. Changes range from changes in
the application environment in which these systems
operate (e.g., new sensors) to changes in hardware
and software technologies (e.g., dramatic increases
in processor speed and memory, high-level
languages, improved methods). Some of the technologies
used when these systems were built can hinder the
system's ability to evolve to meet ever-changing
demands in a cost-effective way. As a result of these
problems, a number of technology solutions have
sprung up under a variety of labels, including
reengineering, reuse, recycling, modernization, renovation,
reconstitution, reverse engineering, design recovery,
redocumentation, respecification, redesign,
restructuring, and retargeting. For a summary of software
reengineering technology, the reader is referred to
[Arnold 93].

1.1 Definition

In this paper we are building on Chikofsky’s work on
a taxonomy [Chikofsky 90], the results of the First
Software Reengineering Workshop of the Joint Logistics
Commanders Joint Policy Coordinating Group on
Computer Resources Management [Santa Barbara
92], as well as insights from ARPA-sponsored work
including STARS and DSSA, and from European efforts
sponsored under the auspices of ESPRIT, Eureka
(Eureka Software Factory), and the Institute for
Systems and Software Technology of the Fraunhofer
Gesellschaft [ISST 92].

Definitions for reengineering found in the literature
include:

•  the  examination  and  alteration  of  an  existing 
system  to  reconstitute  it  into  a  new  form  and  the 
subsequent  implementation  of  the  new  form; 

•  the  process  of  adapting  an  existing  system  to 
changes  in  its  environment  or  technology  without 
changing  its  overall  functionality; 

• modification and possible further development
of an existing system;

• improvement of a system through reverse
engineering (and restructuring) followed by forward
engineering.


1. The Software Engineering Institute is sponsored by the U.S. Department of Defense. This report was funded by the U.S.
Department of Defense.

Figure 1-1 illustrates a taxonomy of terms related to
reengineering by Chikofsky. In this commonly accepted
taxonomy, software system abstractions are
represented in terms of life-cycle phases. Shown are
requirements, design, and implementation. The
traditional process of developing a system by creating
these abstractions is referred to as forward
engineering. Reverse engineering is the process of analyzing
an existing system; identifying system components,
abstractions, and interrelationships; and creating the
respective representations. Redocumentation and
design recovery are two forms of reverse
engineering. Redocumentation refers to the creation and
revision of representations at the same level of
abstraction, while design recovery refers to the
utilization of external information including domain
knowledge in addition to observations of the existing
system to identify meaningful higher levels of
abstraction. The third process component of reengineering is
restructuring. Restructuring is the transformation of
representations at the same level of abstraction while
preserving the system's external behavior.
Reengineering is an engineering process to reconstitute an
existing system into a new form through a combination
of reverse engineering, restructuring, and forward
engineering.

Reengineering relates closely to maintenance, which
is generally viewed as consisting of corrective,
perfective, preventive, and adaptive maintenance.
According to ANSI/IEEE Std 729-1983, software
maintenance is the “modification of a software
product after delivery to correct faults, to improve
performance or other attributes, or to adapt the product to a
changed environment.” In this paper we use the term
system evolution to include software maintenance.

For the purposes of this paper, we take an
encompassing view of reengineering as addressing the
engineering problem of (improving) cost-effective
evolution of large software-intensive systems, both
existing and future, through appropriate application of
effective best-practice engineering methods and
tools. Evolution of many existing systems is not
considered cost-effective and cannot keep
pace with changes in the application (domain)
environment and changes in the computing environment
and software engineering technology. The term
legacy system has been attached to systems with such
characteristics. Changes in the application
environment (the external environment the application
system operates in) as well as in the implementation
environment (the hardware/software platform) have
to be assumed as a given and have to be
accommodated (engineering for change). This need for
engineering for change applies to both existing systems
and new (or future) systems.


Figure 1-1: Common View of Reengineering

1.2  Context 

The focus of this paper is on technical aspects of
reengineering. However, economic, management, and
acquisition aspects play as important a role in the
successful improvement of the capability to reengineer
legacy systems.

The cost of incremental change to a legacy system
needs to be reduced. Criteria for deciding on the need
for reengineering range from heuristics such as age of
code and excessive maintenance personnel training
cost (as found in a 1983 NIST document) to
parameterized cost models (see [ISST 92, Santa Barbara
92]). Improvement in this cost is anticipated by
investing more than the minimal amount into reflecting the
requested change. The additional investment would
go into improving the way the system has been
engineered with the result of smaller incremental cost in
the future. If several legacy systems have to be
reengineered, their similarities can be captured in a
common reusable architecture, treating them as a
family of systems rather than isolated point solutions.
The cost models for reengineering, together with
better understanding of the effectiveness of different
engineering techniques, will allow software engineers to
make reasonable engineering tradeoffs as they
choose a particular evolutionary reengineering
strategy for a legacy system.
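
The investment argument above can be illustrated with a deliberately simple break-even calculation. This is a sketch under stated assumptions, not one of the parameterized cost models cited: it assumes a one-time additional investment and a constant cost per change before and after reengineering. The function name, parameters, and example figures are hypothetical.

```python
import math
from typing import Optional


def break_even_changes(investment: float,
                       cost_per_change_legacy: float,
                       cost_per_change_reengineered: float) -> Optional[int]:
    """Smallest number of future changes after which the extra
    investment has paid for itself, or None if it never does.

    Assumes constant per-change costs (a simplification).
    """
    saving = cost_per_change_legacy - cost_per_change_reengineered
    if saving <= 0:
        # Reengineering did not lower the incremental cost.
        return None
    # investment + n * reengineered <= n * legacy  =>  n >= investment / saving
    return math.ceil(investment / saving)


# Illustrative figures only (e.g., staff-days), not data from the paper.
changes_needed = break_even_changes(400, 50, 30)
```

With these illustrative figures the saving is 20 per change, so the investment is recovered after 20 changes. If the same investment is shared across a family of systems, the change counts of all family members contribute to the break-even point.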

Engineering effectiveness is influenced by how well
an organization is able to manage its engineering
process and improve its engineering capability. SEI has
provided leadership for government and industry to
improve these organizational software process
capabilities through work on the Capability Maturity Model
(CMM) and its use as an assessment and
improvement tool. In the context of this paper we assume that
the reader understands the relevance of such
capabilities for an organization’s ability to systematically,
efficiently, and effectively reengineer legacy systems.


Successful improvement of legacy systems through
reengineering also requires attention to
improvements in the acquisition process and to legal
concerns. The Joint Logistics Commanders Joint Policy
Coordinating Group on Computer Resources
Management is holding a workshop series to address
acquisition issues at the policy level. For further
discussion of these and other inhibitors to successful
transition of improved software engineering practice
see the work done on transition models by SEI and
others [Przybylinski 91; Leonard-Barton 88].

2 A Reengineering Framework

In this paper we have cast reengineering as an
engineering problem. Problem solving involves an
understanding of the problem, i.e., a clear understanding of
the root causes in terms of its existing state, an
understanding of the desired state, and a path (plan) to
evolve from the current state to the desired state.
Figure 2-1 illustrates this. The current state reflects
properties of the existing system and the process by which
the system is engineered (developed and
maintained). A subset of those properties is undesirable,
reflecting the problem to be solved. System
understanding reflects the process of creating and
maintaining an understanding of a system (through
analysis, elicitation, and capture). System evolution
represents the engineering activity of migrating the
existing system to the desired state. Based on an
understanding of the current and desired system state
and available (re)engineering technology, an analysis
making engineering tradeoffs by considering
technical, management, and economic risks and
constraints results in a (re)engineering plan. During the
execution of this plan (i.e., the actual evolution of the
system through engineering activity), the plans may
be reassessed taking into consideration changes in
the context (e.g., technical changes such as
promising new technologies or economic changes such as
budget reductions or increases).


2.1  The  Current  System  State 

The root causes for the lack of cost-effective evolution
fall into two categories: management of the
engineering process and the engineering process itself.
Management of the engineering process is addressed by
SEI's work on CMM and will not be elaborated here.
The second category represents technology root
causes, i.e., the engineering process, methods, and
tools. It will be the focus of further discussion.

The  technology  root  causes  manifest  themselves  in  a 
number  of  ways.  Some  examples  are: 

• Data structures not cleanly implemented.
Assumptions that a specific element of shared
memory (e.g., Fortran COMMON) is used as the
communication mechanism.

•  System  representations  such  as  architectural 
and  design  descriptions  reflecting  the  application 
domain  and  the  implementation  approach  may 
never  have  been  created  or  documented;  the 
documentation  (and  sometimes  even  the  source 
code)  is  out  of  date. 


Figure  2-1:  Reengineering  Problem  Solving 


• Assumptions about the application
environment have been hardcoded in the
implementation. Examples include assuming a point solution
including fixed number and types of real-world
objects.

• The computing environment evolved through
several generations. For example, early
hardware platforms were memory-limited, resulting in
a number of sometimes (in today's view)
convoluted implementation "tricks," such as overlay,
instruction reuse, and cryptic user interaction.
No operating system support was assumed.
Today's computing environments typically consist
of COTS standard operating systems, DBMS,
window systems, and networking support, and
are geared toward a high degree of
interactiveness and "user-friendliness."

• The implementation technology has evolved
from machine code with absolute addressing; to
symbolic assembler, high-level algorithmic
languages (COBOL, FORTRAN, ALGOL); to
languages supporting data abstraction, modularity,
information hiding, concurrency support, data
modeling capabilities, etc. Design and
implementation methods have been coming and
going, each leaving its trademark in the code of
legacy systems. This code may or may not
accommodate the changes demanded from systems
today.

Legacy  systems  also  have  a  number  of  properties 
that  are  worth  preserving.  Examples  include: 

• Legacy systems are deployed and have
undergone the scrutiny of real users with respect to
their functionality meeting their real needs.

• Nonfunctional properties such as
performance and accuracy have been fine-tuned.

• Corrective maintenance has resulted in
"hardened" code and a wealth of test and
validation capabilities.

•  System  history  exists  in  the  form  of  original 
designers,  current  and  past  maintainers,  as  well 
as  bug  report  and  change  order  records. 

In many cases some of the root causes and their
implications may be understood by some experts, but
are not documented and available to the majority of
software engineers. Information about systems is
quite limited, usually to the source code and/or
executable, an operations manual, and people
maintaining the system.


2.2  The  Desired  System  State 

The desired system state is a combination of
properties of the existing system to be maintained,
properties expected of a system as part of state-of-the-art
software engineering practice and implementation
technology, and properties that have their roots in
changing environments and are reflected in the
system history, but may not have been explicitly
expressed by the system user. Examples of maintained
properties are functionality, performance, and
accuracy. Examples of properties resulting from
best-practice software engineering and implementation
technology include portability, modularity, structure,
readability, testability, data independence,
documented system understanding, openness (open system),
interoperability, and seamless integration. Properties
that address continuous change and provide flexibility
include localization of information regarding certain
different types of change in both the application
domain and the implementation, introduction of virtual
machine abstractions, and parameterization
(dynamic as well as generation technology), COTS, and
reuse of components. Properties that encourage reuse
of existing engineering know-how include the
existence of domain models, domain-independent
software architectural principles, domain-specific
architectures, and adaptable components.

The desired system state may be known to system
users, system maintainers, original system builders,
and best software engineering practice experts. The
customer (user) may not necessarily be aware of all
the potentially desired properties and may only be
willing and able to invest in some. Some desired
properties can be provided with proven technology, while
others depend on emerging technology whose
maturity for practical application has not been
demonstrated.

2.3  System  Understanding 

The current state of an existing system and its desired
state represent an understanding of the system. This
understanding is based on artifacts of the existing
system; knowledge and experience with the system
as it may exist in users', maintainers', and original
builders' heads; and documented system history in
the form of bug reports and change records. Figure
2-2 illustrates the sources of information for system
understanding. The artifacts are source code, manuals,
and the executing system. The knowledge and
experience with the system include understanding of
engineering decisions, rationale, and possible or
considered alternatives, as well as undocumented
history and (typically nonfunctional) properties such
as performance, robustness, work-arounds, etc.
History provides insight into robustness of system
components, types, and frequency of changes in the
environment (and implementation).

System Artifacts    System Experts    System History

Source code         Developers        Error log

Manuals             Maintainers       Change orders

Running system      Users


Figure 2-2: Creating System Understanding


Capture, representation, currency, and accessibility
of this system understanding is a big challenge.
Figure 2-3 illustrates a framework for representation of
such system understanding. A central component of
system understanding is the system design records
that document system representations at different
levels of abstraction. This is complemented with
rationale for design decisions, the software engineering
process and methods used, and the evolution history.
Let us first elaborate on models of (software)
systems.

These representations are models of the system.
Models reflect views of the system focusing on certain
aspects with different degrees of detail. The purpose
of a model is to present a view that is understandable,
i.e., not too complex. This is accomplished by the
model capturing those abstractions that are relevant
from a particular perspective. Some models focus on
architectural issues while other models focus on data
representation, behavioral, reliability, and
performance aspects of a system. Examples of models are
domain models, domain-specific architectures,
real-time timing models such as rate monotonic analysis
(RMA), performance models based on queuing
theory, etc.
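
As one concrete instance of a timing model, the classic utilization bound from rate monotonic theory can be sketched as follows. The sketch assumes independent periodic tasks with deadlines equal to their periods and fixed priorities assigned by rate; the bound is a sufficient test only, so a task set that fails it may still be schedulable. The task figures and function names are hypothetical.

```python
from typing import Sequence, Tuple


def rm_utilization_bound(n: int) -> float:
    """Utilization bound n * (2**(1/n) - 1) for n periodic tasks."""
    if n < 1:
        raise ValueError("at least one task is required")
    return n * (2 ** (1.0 / n) - 1)


def passes_rm_bound(tasks: Sequence[Tuple[float, float]]) -> bool:
    """tasks holds (execution_time, period) pairs in the same time unit.

    Returns True if total utilization is within the bound, which
    guarantees schedulability under the stated assumptions.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


# Hypothetical task set: utilization 0.25 + 0.20 + 0.20 = 0.65.
light_load = [(1, 4), (1, 5), (2, 10)]
# Hypothetical overloaded set: utilization 1.1 exceeds any bound.
heavy_load = [(2, 4), (2, 5), (2, 10)]
```

A model of this kind abstracts the system to execution times and periods only, which is what makes it understandable from the timing perspective while ignoring everything else.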

Models have different degrees of formality and may
have the ability to be executed. The models may
reflect designs (i.e., the notation they are expressed in
needs to be transformed into executable
implementations), or they may be executable and capture all the
desired user functionality and can act as prototype
implementations, which can be made more robust or
efficient through reimplementation (i.e., transformation
into a modeling notation that more appropriately
satisfies the need).

As more than one system is considered, models can
show their similarities and differences. Systems can
be grouped into families. Some models focus on
information about the application domain (domain models)
while others focus on the implementation
architecture. Domain models and domain-independent
architectural modeling principles are combined to create
domain-specific architectures. Those architectures
are populated with components and adapted to the
particular application needs. The result is a
technology base of models that can be (re)used for a number
of systems, leveraging existing engineering
know-how. Domain analysis and architectural analysis
contribute to the population of this technology base, while
application engineering can get adapted to utilizing
these models (see Figure 2-4). Furthermore, the
technology base can be expanded by the emergence of
new modeling concepts, e.g., safety modeling.

Figure 2-4: Engineering Technology Base


While some models represent the executing system
itself, other models reflect constraints the system
must satisfy. Those are models used to validate
desired system behavior. Examples of such models are
assertions validated in design reviews or verification,
or translated into test suites and test data validating
the behavior of the running system. When
reengineering a legacy system, such test and validation
models exist and have stood the test of time. They
can be leveraged for verification and validation of the
desired system. Depending on the particular
migration path to the desired system, alternatives to full
regression testing may be considered. One example is
validation of functional equivalence at a certain level
of abstraction through comparison of event traces
[Britcher 90].
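
The idea of comparing event traces at a chosen level of abstraction can be sketched as follows. This is an illustrative sketch and not the procedure reported in [Britcher 90]: it assumes each system can log events as (kind, payload) pairs, and that the abstraction level is expressed as the set of event kinds considered externally observable. All event names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Event = Tuple[str, str]  # (kind, payload)


def project(trace: Iterable[Event], observable: Set[str]) -> List[Event]:
    """Keep only events visible at the chosen level of abstraction."""
    return [(kind, payload) for kind, payload in trace if kind in observable]


def equivalent_at_level(legacy_trace: Iterable[Event],
                        new_trace: Iterable[Event],
                        observable: Set[str]) -> bool:
    """Two runs are equivalent if their projected traces are identical."""
    return project(legacy_trace, observable) == project(new_trace, observable)


# Hypothetical traces of the same scenario on both systems.
legacy_run = [("input", "track 7"), ("overlay_swap", "segment 3"),
              ("output", "alert")]
new_run = [("input", "track 7"), ("cache_lookup", "track 7"),
           ("output", "alert")]
```

At the level of inputs and outputs the two hypothetical runs agree, even though their internal events differ; including an implementation-level kind such as `overlay_swap` in the observable set would make them differ, which shows why the choice of abstraction level matters.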

Engineering decisions, rationale, and alternatives
complement these models. They may be captured
through elicitation processes such as IBIS
[Microcomputer Corporation (MCC)]. The models together
with the engineering knowledge are known in other
engineering disciplines as experience modules.

In this idealized view, the amount of engineering
information available to the engineer grows tremendously,
resulting in information overload. In order to cope with
this situation an intelligent intermediary (intelligent
engineering assistant or engineering associate) will
become essential to the successful utilization of the
system understanding. Technologies that are
potential contributors to this notion of intelligent assistant
include case-based reasoning and intelligent tutoring.

2.4  Evolutionary  Migration  Path 

The understanding of the system, both the current
and the desired system state, is the technical basis for
determining the particular reengineering strategy to
be chosen. It requires analysis, considering
alternatives, and making engineering tradeoffs. Such a
technical engineering analysis consists of two major
components: choosing the degree of legacy leverage,
i.e., what can be taken over and what has to be newly
created; and choosing the approach for migrating
over to the desired system, i.e., how to introduce the
changes into the system. The reengineering case
study by Britcher [Britcher 90] nicely illustrates that no
single approach is appropriate, but engineering
tradeoffs need to be considered.

Legacy leverage refers to the ability to utilize (recycle)
as much as possible of the existing system in the
process of evolving to the desired system. Both the
existing and the desired system can be described in terms
of a collection of models. For the legacy system, code
exists. Other models may have to be derived from the
code or other information sources. Certain
abstractions may not exist in the legacy system or may reflect
undesirable properties. The goal is to eliminate
undesirable properties while at the same time introducing
desirable properties. Choices have to be made as to
which legacy system models to ignore, which ones to
transform, and which ones to leave intact. This is
illustrated in Figure 3-5. The choices are driven by our
understanding of the legacy and desired system
properties as well as their reflection in the different
models. In concrete terms this means that in some
cases, undesirable properties of legacy systems can
be eliminated by massaging the code or transforming
the data representation, while in other cases a new
architecture or data model has to be developed and
only a few system components can be translated into
the new implementation language.

The change can be introduced in a number of ways.
The following are three classic approaches, but
hybrid approaches are possible:

•  Big  Bang  Approach:  The  desired  system 
may  be  built  separately  from  the  legacy  system, 
although  parts  of  the  legacy  system  may  have 
been  recycled.  Once  complete  the  new  system 
is  put  into  operation  while  the  old  system  is  shut 
down. 

• Phase-out Approach, also known as
Incremental Development: The architecture of the
desired system may be created and a skeleton
implementation developed. A mapping between
the data representation of the legacy and the
desired system, implemented as a two-way
transformation filter, allows the skeleton desired
system to run as a shadow of the “live” legacy sys-
implementation are completed and incrementally
added to the skeleton. This approach
incrementally phases out pieces of the legacy system.

• Phase-in Approach, also referred to as
Evolutionary Development: The legacy system
code may be restructured to introduce
modularity and partitioning. Desired system properties
are incrementally introduced into the existing
system resulting in an incremental evolution of
both the architecture and the system
components.
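
The two-way transformation filter mentioned under the Phase-out Approach can be sketched as a pair of inverse mappings between a legacy record layout and the desired data representation. The fixed-width layout, the `Track` type, and the function names below are hypothetical and serve only to illustrate the idea; a real filter would be derived from the actual legacy data definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Desired-system representation (hypothetical)."""
    track_id: int
    x: int
    y: int


def to_desired(record: str) -> Track:
    """Legacy to desired: parse an assumed 16-character record of a
    4-digit id followed by two signed 6-character coordinates."""
    if len(record) != 16:
        raise ValueError("unexpected legacy record length")
    return Track(int(record[0:4]), int(record[4:10]), int(record[10:16]))


def to_legacy(track: Track) -> str:
    """Desired to legacy: format a Track back into the assumed layout."""
    return f"{track.track_id:04d}{track.x:+06d}{track.y:+06d}"


def round_trips(record: str) -> bool:
    # The shadow system can stay in step with the live system only if
    # data survives a trip through the filter in both directions.
    return to_legacy(to_desired(record)) == record
```

Because the mapping works in both directions, the skeleton can consume live legacy data and, as pieces are replaced, hand results back in the form the remaining legacy pieces expect.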

Validation  of  the  desired  system  can  utilize  existing 
testing  capabilities.  Validation  can  be  decomposed 
into  validating  that  the  desired  system  still  provides 
equivalent  functionality  and  detection  of  bugs  in  the 
reimplementation. 

The choice of the particular reengineering strategy is
affected by the risks of the alternative approaches.
Risks to consider are:

• Perceived and actual undesirable and
desirable system properties

• Ability to eliminate or reduce undesirable
system properties

• Maturity of technology inserted into the
system

•  Introduction  of  new  technology  to  system 
maintainers  (reengineers) 

•  Impact  of  introduction  of  the  reengineered 
system 

•  Impact  of  system  changes  on  performance 
and  robustness 

•  Cost  and  time  of  reengineering 

In summary, reengineering is an engineering activity
that involves system understanding and evolution
through application of appropriate engineering
practices. The framework outlined here does not promote
particular techniques but accommodates emerging
technologies as they mature.

2.5 An Engineering Framework

The  framework  presented  above  in  the  context  of  re¬ 
engineering  can  be  used  as  an  engineering  frame¬ 
work  for  software  intensive  systems.  A  full  discussion 
of  this  point  is  beyond  the  scope  of  this  paper.  The  fol¬ 
lowing  characterization  of  different  software  engineer¬ 
ing  processes  and  paradigms  serves  to  quickly 
illustrate  the  validity  of  this  claim; 

• New system development: The system to be improved in the
application environment may be a system performing without
computer support. This legacy system has desirable properties
to be maintained and undesirable properties to be overcome.
For example, for many information systems the data model of
the legacy system, though not documented, may be directly
applicable to the desired system. In traditional life-cycle
terminology this is referred to as the requirements phase.
Software recycle is applicable only if parts of the legacy
system are computer-based. For the introduction of the new
system the same migration alternatives may be considered as
discussed in the context of reengineering.

• Reengineering of future systems: Change in the
application environment and the implementation environment
are givens. When a new system is being defined, customers
often focus on the functionality needed to address their
particular problem at that time. Many of the types of changes
that will occur over the lifetime of the system and their
implications on desirable system properties are not considered
during requirement definition. Reengineering of future systems
implies that engineering for change and up-to-date maintenance
of a system understanding (system design records) occurs from
the outset. Engineering for change requires an understanding
of commonly accepted changes as well as an anticipation of
paradigm shifts due to new technology and localization of
assumptions about certain environment constants.

• Open systems: The open systems concept has gained
momentum over the last few years, as reflected in
organizations such as X/Open, the Open Software Foundation
(OSF), and the User Alliance for Open Systems. This concept
permits interoperability, allows rapid technology insertion
and upgrade, encourages the use of alternative solutions, and
allows one solution to be applied in a number of systems.
Characteristics of open systems are modularity and standard
interfaces. These are desirable properties of both legacy and
future systems as they reduce system cost.

• Reuse: Reuse is an engineering activity that focuses on
the recognition of commonalities of systems within and across
domains. It consists of the creation of models with different
abstractions (ranging from code components to domain models)
and their use during the engineering of an application. Thus,
the focus is on the growth and utilization of the technology
base.

• Evolutionary development: Evolutionary development
focuses on designing the architecture of a system in such a
way that the capabilities offered to the user can grow
incrementally. New capabilities may be introduced through
prototyping of new system components (possibly utilizing
different implementation technology). Such prototypes
interoperate with the operational system and may get hardened
through incremental reengineering.

• Megaprogramming: Megaprogramming focuses on recognition
of system commonalities at high levels of abstraction (e.g.,
architecture) and creation of system instances through
parameterized automatic composition or generation.

• Model-Based Software Engineering (MBSE): The objective of
MBSE is to improve the effectiveness and efficiency of
producing software intensive systems through better
utilization of engineering experience and system
understanding. MBSE focuses on the use of engineering product
models as the primary means for improving the construction and
maintenance of software.

3 SEI Workshop: October, 1993

The SEI conducted a reengineering workshop on October
26-27, 1993 to identify work which is underway, and to
develop an understanding of needs which are currently unmet
and which can be addressed by the SEI or others. Invited
participants included broad representatives of the DoD,
commercial and academic community. Although a large number of
themes were discussed, there was broad consensus on a number
of basic needs, together with some ideas on logical next
steps.

The October workshop will be followed up by a larger
workshop in May, 1994 at which a number of initial themes
will be developed in greater detail, and at which additional
community resources will be identified.

The needs expressed by the practitioner community were
quite consistent with the overall approach outlined in this
paper. Some of the major points discussed at the workshop are
outlined below, together with preliminary ideas on where the
work may be followed up.

The  expressed  needs  included  the  following: 

• Definition of conceptual frameworks and reference
models. Several groups are beginning to develop a common
conceptual understanding, including IEEE and follow-up work
to the JLC Santa Barbara workshop (Santa Barbara 92). There is
a need for continuing efforts by these groups to help clarify
the fundamental conceptual issues of reengineering. The
concepts of evolutionary migration path and the model-based
approach outlined in this paper can offer some contribution to
these efforts.

• Domain engineering perspective. Consistent with the
approach outlined earlier in this paper, the workshop
identified the need for and potential benefit of incorporating
the perspectives of domain engineering and architectures into
an overall understanding of reengineering. These perspectives
offer promise for helping to evolve from an ad hoc orientation
to the planned, evolvable engineering of legacy systems. Among
the groups which are currently focusing on these issues are
SEI and DSSA. In addition, this is a promising area for cross
fertilization with STARS, DoD reuse initiatives, PRISM and
CARDS.

• Engineering process models. A number of organizations
and companies have developed process models for
reengineering. These process models offer a certain degree of
commonality. An analysis of the coverage of the different
process models, together with empirical data on experiences
with these models, can be of strong benefit to the rest of the
community.

• Economic and cost benefit analysis, and metrics. There
is a strong need for better data on cost benefits, a better
understanding of reengineering economics issues, together with
more refined and easily accessible metrics. A draft
reengineering economics handbook (MIL-HDBK-REH, Draft, 93) has
been developed by the Reengineering Economics Panel from the
Santa Barbara workshop. This handbook provides an important
baseline for economic issues and will be refined based on
experience. In addition, empirical data on actual uses of cost
benefit analyses and the application of metrics programs are
urgently needed.

• Decision support systems and project selection
criteria. A recurrent theme concerned the requirement for
better decision support models including a consideration of
technical, economic and transition issues. There is a need for
comprehensive work to integrate these perspectives into a
common model to help in decision making on when and how
reengineering can provide benefit.

• Case studies and lessons learned. Practitioners cited
the potential usefulness of being able to learn from the
successes and failures of those who have attempted similar
efforts in the past. The follow-up workshop will discuss the
feasibility of organizing such a program. Systematic case
studies can represent one step toward the development of an
organized community of reengineering practitioners.

• Management and transition issues. A number of critical
success factors which distinguish between successful and
unsuccessful projects have been management and transition
issues, including the management of expectations. A number of
the insights gained from transition management and CASE
adoption are relevant and can be codified for application to
the reengineering domain.

4 Conclusion

Reengineering has been presented as an engineering
problem. As such, reengineering draws from a number of
software engineering technologies. Advances in best practice
of (re)engineering benefit from innovative research (i.e.,
creation of new technology solutions); analytical research
(i.e., recognition of promising technologies and technology
trends); and engineering research (i.e., the maturation of
technology into engineering use). Evolution of conceptual
frameworks for subareas of software engineering can be
successfully leveraged through technical leadership in
selected community forums. In the context of reengineering, a
model outlines a roadmap for improvement in the effectiveness
and efficiency of reengineering systems.

Current STSC Reengineering Projects:

MIL-HDBK-RAH Application Findings,
Reengineering Project Planning Process,
and STSC Reengineering Survey Results

Michael R. Olsem¹ and Chris Sittenauer²


¹Science Applications International Corp. ²USAF Software Technology Support Center

OO-ALC/TISEC
7278  4th  Street 
Hill  AFB,  UT  84056-5205 

RAH Analysis

In early 1992, the STSC published a process for matching
legacy software systems with reengineering strategies. Since
then, the DoD's Joint Logistics Commanders Workshop on
Reengineering in Santa Barbara, California (September 1992)
used this process as the basis for the reengineering strategy
selection section of the draft DoD MIL-HDBK-RAH (Reengineering
Assessment Handbook). Comptek is now under contract to the
USAF Cost Agency to produce version 1.0 of the RAH by
December 1993.

The authors have been using a draft version of RAH while
assessing various legacy software systems for reengineering at
several USAF and DoD organizations nationwide. Based upon our
findings, we are proposing changes to the RAH. But overall, we
find the results of the RAH analysis to be pretty consistent
with those reengineering strategies we would have chosen
without the handbook.

This paper won't dwell on the details of the RAH, as that
subject is being discussed in another paper at this
conference.


Reengineering  Surveys 

Besides testing the draft RAH by assessing reengineering
projects, the STSC has initiated an in-depth analysis of the
correlation between reengineering strategy, maintenance
environment, and potential return-on-investment (ROI). We have
developed two surveys for DoD reengineering projects. Over 100
DoD reengineering projects have been identified and are
participating in this survey.

The first survey targets reengineering projects still being
planned or underway. This survey tries to understand the
reasons an organization attempts a reengineering project. We
correlate this information with the reengineering strategy the
organization is most likely to use versus what RAH would
suggest.

The second survey targets completed reengineering projects.
This survey looks at how "successful" the reengineering
project was in terms of maintenance efficiency and reduced
costs.

Both surveys are now being mailed out and results should be
firm by the February time frame of the NSWC conference. Copies
of both surveys have been added as an addendum to this paper.
The reader's comments would be most welcome.

Planning 

The actual reengineering is easy. The planning for
successful reengineering is hard.

Matching an appropriate reengineering strategy to legacy
software systems is but one of the last steps necessary for a
successful reengineering project. The planning and preparation
prior to the actual reengineering is absolutely critical.
Unfortunately, most of the USAF organizations we advise are
far too eager to rush right out and buy a reengineering tool.
This, we believe, is analogous to shopping for groceries
without a list of projected meals. Not only do you rarely
return with all you need, but you end up purchasing items you
don't need at all.

In STSC's 1993 Software Reengineering Technology Report, we
propose a nine-step preparatory planning process for
reengineering projects. (We've included a copy of this process
as an addendum to this paper.) We will be delivering a
half-day tutorial based on this process at the DoD's 1994
Software Technology Conference (STC) in Salt Lake City, Utah,
and at the 12th Annual National Conference on Ada Technology
(ANCOAT) in Williamsburg, Virginia.


In order, the reengineering preparation steps are:

1. Evaluate your organization's business needs

2. Define your organization's development/maintenance
environment

3. Form a reengineering team

4. Create a standard set of software metrics

5. Ensure your standard testbed/validation suite is current
and complete

6. Analyze your legacy systems (This is where the RAH becomes
useful)

7. Analyze the available COTS/GOTS reengineering tools.
(Something the STSC will gladly help you with.)

8. Create a reengineering implementation plan

9. Train your reengineering team, users, and maintenance
staff

Preparing to Reengineer

(Extract  from  STSC's  1993  Software 
Reengineering  Technology  Report) 

Introduction 

There  is  a  core  process  that  every 
organization  should  follow  when 
reengineering.  Automation  and  tools  can 
only  support  this  process,  not  preempt  it. 
Again,  this  comes  back  to  the  realization  that 
reengineering  is  not  a  silver  bullet  that  will 
solve  all  of  your  maintenance  problems.  The 
real  work  must  come  from  you.  Don't  make 
the  same  mistake  the  software  community 
made  with  CASE,  AI,  OOA,  etc.  Good 
technologies  do  not  necessarily  make  for 
good software. Reengineering without a target maintenance
environment (i.e., one that is well defined and consistent
with your organization's CMM goals) will waste your time and
money, leaving you with the same problems you now face, and
possibly worse off if your experienced maintenance staff must
now contend with code they no longer understand.

Reengineering  tools  will  prove  a  great 
help  in  moving  your  legacy  systems  to  a 
new,  hopefully  better,  maintenance 
environment.  But  you,  and  your 
organization,  must  define  this  environment, 
and  how  to  move  your  legacy  systems  there. 
We  propose  the  following  core  process,  or 
approach,  to  preparing  for  any  software 
reengineering  project. 

Step 1 - Evaluate Your Organization's Business Needs

To begin with, you need to seriously look at your
organization's business needs and goals. Although this is the
beginning of
business  reengineering,  it's  also  very 
important  to  your  software  reengineering 
projects.  Your  organization's  business  needs 
(or  the  outcome  of  your  organization's 
business  reengineering  process)  should 
dictate  what  your  development/maintenance 
process  will  look  like.  Besides  making 
logical  sense,  tailoring  your  process  to  meet 
your  business  needs  is  the  only  argument  that 
will  carry  any  weight  with  upper 
management.  You  know  them.  They're  the 
ones  that  will  ultimately  fund  your  efforts 
and  back  you  up  during  the  initial  dark  days 
when  the  enemies  of  change  will  be  charging 
at  you  from  all  sides. 

Step 2 - Create Your Development/Maintenance Process

Input  to  this  creation  process  should 
come  from  users,  programmer/analysts,  and 
managers.  Consult  the  programmers 
working  on  the  current 
development/maintenance  process.  Even  if 
the  current  maintenance  process  is  informal, 
it  should  have  merit.  Change  always  has  a 
better  chance  at  success  when  it  fits  in  with 
the organization's cultural norms. But the users may want
faster modification turnaround. And the managers may be
frustrated with their inability to make the software change as
quickly as the business environment they are facing.

The creation of a development/maintenance process is clearly
beyond the scope of this article. It is much better covered by
SEI's CMM. But be aware that this step is time-consuming and
must be done prior to reengineering. If your organization
already has a well-defined, well-regulated
development/maintenance process, then clearly this step is
redundant.

Step 3 - Form Your Reengineering Team

Create  a  reengineering  team 
composed  of  those  programmer/analysts, 
users,  and  managers  that  are  knowledgeable 
and  open-minded  enough  to  at  least  look  at 
technical  change  to  solve  their  maintenance 
problems.  Choose  carefully.  They  should 
be  technically  competent,  credible,  patient, 
possess  good  social  skills,  and  know  the 
organization's  culture  well  enough  to 
function  successfully.  This  team  will: 

•  Understand  the  current  environment 
and  business  needs 

•  Establish  goals,  strategies,  and  action 
plans 

•  Provide  cost  justification 

•  Test  new  tools 

•  Purchase  tools 

•  Provide  internal  marketing, 
consulting,  and  service 

•  Train  others 

•  Continue  to  research  and  evolve 
technology 

•  Provide  vendor  liaison  -  partnerships 
to  advance  technology 

•  Improve  their  own  processes  using 
client  satisfaction  surveys 

Be sure the user community is well-represented. Besides
their expertise, you need their support and buy-in. In
addition, they need to help in the validation of the
post-reengineered system.

Step  4  -  Select  a  Standard  Set  of 
Software  Metrics 

You  must  know  where  you  are  before 
you  can  get  to  where  you  want  to  be.  Upper 
management  will  demand  this.  You  must 
show  them  what  they  got  in  return  for  their 
money and support. For your own career's security, you need
metrics to prove to the doubting Thomases that process control
and reengineering were worth all the pain you caused them.

But beyond the political concerns, metrics help you find out
what is wrong and how to fix it. Think of software as you
would  an  ill  patient  in  a  hospital.  No 
competent  doctor  would  think  of  treating  a 
patient  without  knowing  all  the  symptoms 
and  test  results.  Similarly,  you  shouldn't 
treat  your  ill  and  dying  software  without 
knowing  what  symptoms  your  software  is 
suffering.  Metrics  and  software  analysis 
tools  gather  this  data  for  you.  But  just  as 
your  family  doctor  keeps  a  record  of  your 
health  over  the  years,  you  should  continue  to 
gather  metrics  on  your  systems  for  their 
entire  lifecycle.  You  and  your  doctor  need 
to  know  what  is  normal  in  order  to  react 
instantly  to  abnormal  signs. 
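
One way to "know what is normal" is to keep the history of a metric and flag new readings that fall far outside it. The sketch below is only an illustration: the metric (defects reported per release), the two-standard-deviation threshold, and the function name are assumptions made for the example, not part of the STSC process.

```python
from statistics import mean, stdev


def is_abnormal(history, new_value, k=2.0):
    """Return True if new_value lies more than k sample standard
    deviations from the mean of the earlier readings in history."""
    if len(history) < 2:
        return False  # not enough data yet to say what "normal" is
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return new_value != mu
    return abs(new_value - mu) > k * sigma


if __name__ == "__main__":
    defects_per_release = [10, 12, 11, 9, 10, 11]  # hypothetical baseline
    for reading in (11, 20):
        print(reading, is_abnormal(defects_per_release, reading))
```

The check is only as good as the history behind it, which is the reason to keep gathering the same metrics over the system's entire lifecycle.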

One last warning about metrics. They must be non-threatening
to the troops. Don't use them for personnel appraisals,
judgments, or budgets - just use metrics for process
improvement and continued health of your software systems.
Otherwise, they'll start reporting what they think you want to
hear. In order to reward quality, we require honest, accurate,
and consistent data from the troops.

Step 5 - Create a Standard Testbed or Validation Suite

This  is  a  terribly  overlooked  part  of 
the  reengineering  process.  How  do  you 
prove  the  end  product  of  reengineering  is 
functionally  equivalent  to  the  input  source 
code?  Whether  you  reformatted, 
restructured,  translated,  or  generated  code 
from  captured  design  data,  you  need  to 
validate  the  functionality  of  the  resulting 
software.  Critical  systems,  in  particular, 
need  to  demonstrate  unchanged  functionality 
after  reengineering.  (This  is  a  key  reason 
why  it  is  extremely  unwise  to  propose  any 
functional  changes  until  after  the 
reengineering  process  is  stabilized.)  So 
check  your  current  validation  suite  for  the 
target  software  system.  Ask  yourself  if  it's 
current  and  complete.  If  not,  take  the  time 
to  create  a  complete  testbed  for  the  target 
software system. If the target software system is rather
large and you decide to adopt an incremental approach to its
reengineering, then you can validate or create a validation
suite incrementally, based only upon those parts of the system
to be reengineered.
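
A functional-equivalence check of this kind can be sketched as a small harness that feeds the same inputs to the legacy and the reengineered code and reports every disagreement. This is a minimal sketch under stated assumptions: the two discount routines and the input list are hypothetical stand-ins for a real system and its validation suite.

```python
def _outcome(func, args):
    """Capture either the return value or the type of exception raised."""
    try:
        return ("ok", func(*args))
    except Exception as exc:
        return ("error", type(exc).__name__)


def validate_equivalence(legacy, reengineered, test_inputs):
    """Return a list of (args, legacy_outcome, reengineered_outcome)
    for every input on which the two implementations disagree."""
    mismatches = []
    for args in test_inputs:
        expected = _outcome(legacy, args)
        actual = _outcome(reengineered, args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Hypothetical legacy routine and its restructured replacement.
def legacy_discount(quantity):
    if quantity < 0:
        raise ValueError("negative quantity")
    if quantity >= 100:
        return 15
    if quantity >= 10:
        return 5
    return 0


def reengineered_discount(quantity):
    if quantity < 0:
        raise ValueError("negative quantity")
    for threshold, percent in ((100, 15), (10, 5)):
        if quantity >= threshold:
            return percent
    return 0


if __name__ == "__main__":
    # Boundary values stand in for a "current and complete" suite.
    suite = [(-1,), (0,), (9,), (10,), (99,), (100,), (500,)]
    print(validate_equivalence(legacy_discount, reengineered_discount, suite))
```

An empty result shows agreement only on the inputs tried, so the value of the check depends on how complete the suite is.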

Traceability  is  also  important. 
Design  functionality  must  be  associated  with 
the  source  code.  This  is  your  only  link 
between  the  new,  repository-based  design 
representation  and  the  old  source  code. 
When  looking  at  reengineering  tools,  ask  the 
vendors  about  traceability. 

For  a  much  more  in-depth  discussion 
of  testing  strategies,  we  strongly  recommend 
you  read  two  other  STSC  reports:  Test 
Preparation,  Execution,  &  Evaluation 
Software  Technologies  Report  and  Source 
Code  Static  Analysis  Technologies  Report. 


Step  6  -  Analyze  Your  Legacy 
Systems 

Each  target  system  will  have  different 
reengineering  needs  and  a  different 
reengineering strategy. Some systems may have sufficient
documentation, but your maintenance process requires capturing
the design information to a repository. Another system may
have missing or inaccurate documentation, so it needs
redocumentation and reverse engineering to a repository.

Another reason to analyze your legacy systems is to discover
their unique aspects regarding eventual reengineering. Is the
system comprised of several programs linked by JCL? Does it
use any embedded DBMS or Assembler macro calls? Are other
languages embedded or called? What about any online screens?
Are there any implied "dynamic" (i.e., execution time)
requirements such as response time constraints, available
memory, etc.? This must all be catalogued prior to the next
step of finding the right reengineering tool(s). Moreover,
this information must also be captured within the design
repository. When picking your repository or reengineering
tools, find out if they can handle such data.

During  this  process  step,  gather 
whatever  requirements  data  you  can  find. 
Currently,  reengineering  tools  cannot  supply 
you  with  requirements  information  from 
source  code.  But  you  need  requirements  data 
to  effectively  maintain  your  software 
systems.  Talk  to  the  maintainers,  users, 
management,  and  anyone  else  who’s  worked 
with  the  system  from  the  time  it  was 
developed.  Then  ask  the  vendor  whether 
their  repository  can  store  requirements  data. 

There are numerous tools to help you in gathering analysis
data on your legacy systems. These tools will help you analyze
the complexity of your software, check the system's degree of
structuredness, evaluate the impact of a given modification,
and identify standards violations (e.g., non-initialized
variables, dead code, etc.). This data will help during the
validation phase and supply objective data to support your
selection of the most appropriate reengineering strategy
applicable to the target software system. Program analysis
tools do not modify source code. Running program analysis
tools before and after reengineering provides a means of
measuring the effectiveness of the reengineering. These tools
are described in much more detail (along with a list of such
tools) in the STSC's 1993 Test Preparation, Evaluation, and
Execution Software Technologies Report and Source Code Static
Analysis Technologies Report.
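
As a rough illustration of a before-and-after measurement, the sketch below counts decision points in a piece of source code, in the spirit of cyclomatic complexity. It is a toy built on assumptions, not one of the tools listed in the STSC reports: it analyzes Python source with the standard ast module only because that keeps the example self-contained, and the two versions of the grade routine are invented for the demonstration. Like the program analysis tools described above, it reads the source without modifying it.

```python
import ast

# Node types treated as one decision point each (a deliberately rough list).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def complexity(source: str) -> int:
    """Approximate cyclomatic complexity: decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths
            decisions += len(node.values) - 1
    return decisions + 1


BEFORE = '''
def grade(score):
    if score >= 90:
        return "A"
    else:
        if score >= 80:
            return "B"
        else:
            if score >= 70:
                return "C"
            else:
                return "F"
'''

AFTER = '''
def grade(score):
    for floor, letter in ((90, "A"), (80, "B"), (70, "C")):
        if score >= floor:
            return letter
    return "F"
'''

if __name__ == "__main__":
    print("before:", complexity(BEFORE))
    print("after: ", complexity(AFTER))
```

Recording the same measure before and after reengineering gives one objective figure to set beside the functional validation results.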

Step  7  -  Reengineering  Tools 
Analysis 

All  the  preceding  steps  should  be 
accomplished  prior  to  your  initial  efforts  at 
reengineering  tools  analysis.  Only  within 
such  a  context  can  reengineering  tools  help 
your  maintenance  process.  "Buying  software 
tools  the  way  most  software  development 
organizations  do  is  like  going  to  the  grocery 
store  without  a  list  -  when  you  get  home,  not 
only  do  you  not  have  everything  you  need, 
you  have  a  lot  of  stuff  you  don't  need" 
(remark  made  by  Denis  Meredith  during  the 
8th  International  Conference  and  Exposition 
on  Software  Maintenance  &  Re-engineering, 
Washington,  DC,  USPDI,  Silver  Spring, 
MD,  5-9  August  1991). 


Find the reengineering tools that will accomplish your
maintenance and reengineering goals. They should also fit the
needs of your target systems as defined above. Be aware that
there may not be a COTS reengineering tool that satisfies your
requirements. Your source code language may not have a
sufficiently large user base to warrant any vendor creating a
reengineering tool. Or there may exist reengineering tools for
your target source code language but they won't capture JCL,
screens, DBMS calls, macros, etc. You may then need to decide
whether the discrepancies between the tool's functionality and
the legacy system's characteristics are sufficiently small to
still make the tool usable.

You may have to "massage" your legacy software so that your
reengineering tools can process it. This may take several
forms:

• Most reengineering tools run on workstations or PCs. Thus,
if your software resides on the mainframe, there is the issue
of downsizing or porting your legacy software from your
mainframe.

• If the tool will only process COBOL 85 and your legacy
systems are written in COBOL 74 and COBOL 68, use a
translator to upgrade your COBOL code to 85 first.

• If your legacy systems have embedded macros, DBMS calls,
other language calls, etc., and your reengineering tool cannot
process anything outside the primary source code language,
then stub out these areas.

Tool integration is also an important consideration. As
discussed above, no single reengineering tool will match all
your requirements. But where one tool may fall short, another
may pick up the slack. So check whether the reengineering tool
can integrate with other reengineering tools, your chosen
repository, the hardware platform/OS, and the target
maintenance process/tools.
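
The matching of tools to catalogued system characteristics can be sketched as a simple set comparison. Everything named here is hypothetical: the LegacySystem and Tool records, the feature labels, and "Tool A" and "Tool B" are placeholders, not real products. The sketch lists what each candidate tool would leave uncovered, which is the information you need when deciding whether the discrepancies are small enough to live with.

```python
from dataclasses import dataclass, field


@dataclass
class LegacySystem:
    """Characteristics catalogued while analyzing a legacy system."""
    name: str
    language: str
    features: set = field(default_factory=set)


@dataclass
class Tool:
    """What a candidate reengineering tool claims to handle."""
    name: str
    languages: set
    handles: set


def coverage_gaps(system: LegacySystem, tool: Tool) -> set:
    """Return everything about the system that the tool does not cover."""
    gaps = set(system.features - tool.handles)
    if system.language not in tool.languages:
        gaps.add(f"language: {system.language}")
    return gaps


def rank_tools(system: LegacySystem, tools):
    """Order candidate tools by the number of gaps, fewest first."""
    return sorted(tools, key=lambda tool: len(coverage_gaps(system, tool)))


if __name__ == "__main__":
    payroll = LegacySystem(
        "payroll", "COBOL 74",
        {"JCL", "embedded DBMS calls", "online screens"},
    )
    candidates = [
        Tool("Tool A", {"COBOL 85"}, {"JCL"}),
        Tool("Tool B", {"COBOL 74", "COBOL 85"}, {"JCL", "online screens"}),
    ]
    for tool in rank_tools(payroll, candidates):
        print(tool.name, sorted(coverage_gaps(payroll, tool)))
```

A language gap points to the translation step described above, and a feature gap to stubbing out or to a second, complementary tool.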

Step  8  -  Create  a  Reengineering 
Process 

Based  on  the  preceding  steps,  you 
must  now  define  your  reengineering  process. 
Step  by  step,  decide  what  needs  to  be  done, 
taking  into  account  the  legacy  software 
characteristics  (starting  point),  your  new 
maintenance process (finish point), and all intervening
steps such as validation, metrics, tool integration, training,
etc. Set up a viable schedule with milestones to report back
to management.

One  cautionary  note:  it  is  very 
tempting  to  include  requirements 
modifications  during  the  reengineering 
process.  DON’T!!!  This  creates  problems 
with  functional  validation  and  complicates 
the  whole  project  immeasurably.  Wait  until 
the  reengineering  project  is  complete  and 
stable  (i.e.  functionally  validated)  before 
changing  and  implementing  modifications. 

It  is  at  this  point  that  you  must  make 
a  basic  decision  regarding  your  reengineering 
process  -  evolution  vs.  revolution.  There  are 
three  fundamental  strategies  to  approach 
reengineering: 

•  Systems  reengineering 

An entire system is reengineered. Systems reengineering can
be used for one-time reengineering projects where you need to
solve an immediate problem for a particular application. The
advantage to this strategy is that the system is brought into
the newly defined maintenance environment all at once.
A  disadvantage  is  the  amount  of  risk  your 
organization  runs.  The  full  system,  after 
reengineering,  must  work  flawlessly.  But 
can  you  guarantee  that?  Is  the  functionality 
intact?  Were  any  bugs  introduced  during  the 
reengineering?  Remember,  there  is  a  good 
deal  of  human  intervention  with  any 
reengineering  project.  If  you're  confident 
you  can  control  these  potential  problems, 
then  systems  reengineering  will  allow  you  to 
quickly  switch  over  from  the  old  system  to 
the  reengineered  system.  Otherwise,  perhaps 
one  of  the  following  strategies  would  be 
better. 

•  Incremental  reengineering 

Parts  of  the  system  are  reengineered  as 
needed  thus  creating  a  new  "version"  of  the 
system.  When  a  system  modification  is 
required,  take  the  opportunity  to  reengineer 
only  those  parts  of  the  system  that  need  to  be 
changed.  The  advantage  to  this  approach  is 
a  reduced  risk  when  switching  over  to  the 
new  system.  Only  portions  of  the  system 
were  changed  and  these  portions  were  clearly 
identified. Should a problem occur, its
origin can be clearly traced. The
disadvantage  lies  in  the  number  of  interim 
versions  of  the  system  generated  until  the 
entire  system  is  reengineered  eventually. 
Each  version  would  contain  some 
reengineered  code  and  some  not. 
Configuration  control  considerations  are 
critical for this strategy to work.

•  Partial  reengineering 

Functionally  cohesive  sections  of  the  system 
are  reengineered  as  needed.  This  is  similar 
to  both  systems  and  incremental 
reengineering.  When  a  system  modification 
is required, only those parts of the software
affected by the modification (e.g. subprogram
or called routine) are reengineered.
But  the  rest  of  the  system  is  left  alone  -  until 
the  next  modification.  The  advantage  to  this 
strategy  is  the  inherent  advantage  of  modular 
design.  If  a  problem  occurs  after  switch 
over,  just  re-implement  the  old  module.  The 
disadvantage  lies  in  interface  issues.  If  one 
or  more  modules  reside  in  a  repository  but 
the  remainder  of  the  system  is  executed  the 
same  old  way,  they  must  interact  flawlessly. 
Other  questions  to  consider  include:  will 
there  be  any  response  time  degradation  that 
is unacceptable (particularly for real-time
systems); and will the generated code (either
from  a  repository  or  restructurer)  interface 
correctly  with  the  remainder  of  the  legacy 
source  code? 

If  reengineering  is  viewed  as  the  path 
to  long-term  software  maintenance 
improvement  for  your  organization,  then  we 
recommend  incremental  or  partial 
reengineering.  But  if  you  need  a  short-term 
solution  to  an  immediate  problem  for  a 
particular  application,  then  systems 
reengineering  might  be  the  better  strategy. 

Step  9  -  Train,  Train,  Train... 

Train  your  reengineering  team.  They 
need  to  understand  several  key  concepts: 

•  How  to  manage  technological  change 

•  The  basics  of  reengineering 

•  How  to  use  the  selected  reengineering 
tools 

•  How  to  use  the  target 
development/maintenance  process 


Because  of  the  training  and  time 
investments  in  your  reengineering  team,  you 
should  seriously  consider  making  the  team 
permanent.  Reengineering  is  an  ongoing 
process.  Any  moderate  to  large  software 
organization  will  have  more  than  enough 
legacy  software  to  keep  such  a  team  busy  for 
the  next  several  years  at  least. 

Summary 

Reengineering  requires  a  core  process 
of  manual  tasks  prior  to  buying  your  first 
tool.  There  is  no  avoiding  these  tasks.  They 
are  common  throughout  all  successful 
reengineering projects. Obviously, short-term,
heavy-duty resources must be allocated
up front to gain the cost savings over the
long  haul.  Decide  now  whether  you  can 
gather  the  support  for  this  level  of 
commitment.  Smaller  pilot  projects  are 
usually  a  good  way  of  proving  reengineering 
concepts and their cost savings. But
whatever  political  strategy  you  employ,  don't 
attempt  reengineering  without  each  of  the 
steps  outlined  above. 


A Road Map Guide to
Software Reengineering Technology


Robert S. Arnold


Purpose  and  Structure 

This paper introduces the reader to software reengineering definitions and technology. Software reengineering
technology supports three major themes: (1) understanding software, (2) improving software, and (3) capturing,
preserving, and extending knowledge about software. This paper will help the reader see the boundaries of
reengineering technology and appreciate some of its risks.

The reader is assumed to be interested in the maintenance, improvement, or understanding of existing source code.
This interest may be of itself, or part of a larger task, such as converting software from one operating system to
another. This paper will help reengineering nonexperts appreciate reengineering issues. Experts will see a fresh
contemporary viewpoint.

The paper is structured as follows: “Reengineering Definitions” defines reengineering terms and places them in
context. “The Significance of Reengineering” discusses why reengineering is significant and worthy of the reader’s
time and study. “Reengineering Technology” discusses reengineering technology themes and connects technology
areas with the themes. It briefly discusses the importance and significance of the technology areas. “Reengineering
Strategies and Risk Mitigation” discusses risks and cautions in using reengineering technology. “Future Advances”
discusses future research issues for reengineering. Appendix A provides a procedure for classifying a software
transformation consistent with the reengineering definition used here.

Reengineering  Definitions 

Software  reengineering  is  any  activity  that 

(1)  improves  one’s  understanding  of  software,  or 

(2)  prepares  or  improves  the  software  itself,  usually  for  increased  maintainability,  reusability,  or 
evolvability. 

In  this  definition,  the  term  “software”  includes  —  in  addition  to  source  code  —  documentation,  graphical  pictures, 
and  analyses.  The  analyses  are  about  source  code,  designs,  specifications,  test  data,  and  other  documents  directly 
supporting  software  development  or  maintenance. 

Part 1 of this definition includes activities such as browsing, measuring, drawing pictures of software, documenting,
and analyzing. Part 2 includes activities designed to improve static qualities of software, usually so the software
is easier for people to work with.

Part 2 tends to exclude modifications whose purpose is not for maintainability, reusability, evolvability, or
improving one’s understanding of the software. For example, optimizing code or restructuring it purely for
performance is not commonly thought of as reengineering.¹

Reverse engineering pertains to part 1 of the reengineering definition. Reverse engineering generates
information  about  a  software  representation  (such  as  source  code)  to  help  one  understand  it  or  to  facilitate 
its  processing. 

Other reengineering definitions

Different  people  or  groups  seem  to  have  different  meanings  for  reengineering.  For  example,  GUIDE  defines 
reengineering  as 

the  process  of  modifying  the  internal  mechanisms  of  a  system  or  program  or  the  data  structures  of 
a  system  or  program  without  changing  its  functionality  [GUIDE89]. 

Chikofsky  and  Cross  define  reengineering  as 

the  examination  and  alteration  of  a  subject  system  to  reconstitute  it  in  a  new  form  and  subsequent 
implementation  of  that  form  [Chikofsky90]. 

For reference in this discussion, the reengineering definition described in the previous section, the GUIDE definition,
and the Chikofsky and Cross definition will be called definitions A, G, and C, respectively. Both definitions G and
C are valuable and useful. The reengineering definition one uses depends on one’s perspective. Reengineering
definition A is used in this paper for several reasons.

First,  definition  A  centers  on  the  purpose  of  reengineering  activities,  rather  than  on  their  means  or  processes.  This 
recognizes that reengineering activities can use technology that, in other contexts, may not be called “reengineering.”
For  example,  impact  analysis  and  software  testing  are  used  in  software  maintenance,  but  not  always  in  the  context 
of  reengineering. 

Second,  on  close  examination,  definitions  G  and  C  allow  different  activities  to  be  called  “reengineering.”  Definition 
A  is  more  inclusive.  For  example,  under  definition  C  one  might  classify  a  software  functionality  change  as 
reengineering. But under definition G one would not. Under definition A one could classify a functionality change
as  reengineering,  provided  it  was  for  the  purposes  mentioned. 

As  another  example,  definition  G  does  not  consider  creating  information  about  software  as  reengineering, 
unless  the  information  is  used  to  support  modification.  Definition  C  does  define  creating  information  about 
software as reengineering,² but only if it is on the path to reimplementation. Definition A allows the creation
of  information  as  an  end  in  itself  as  part  of  reengineering.  For  example,  definition  A  considers  as 
reengineering  the  creation  of  information  about  software  to  facilitate  understanding  or  to  promote 
maintainability,  reusability,  or  evolvability. 


¹The context of reengineering is expanding so rapidly, however, that some people include even these as reengineering
(improving suitability for use) of the code.

²If we equate the information with “a new form” in definition C.

Third, definition G tends to focus on changes to source code. Improvements of non-source code items, such as
documentation and specifications, may also be considered as reengineering. Such activities are allowed in definition
A, and may be allowed in definition C.

This  discussion  should  not  be  taken  to  imply  that  one  reengineering  definition  is  better  than  another.  The  definitions 
capture  different  perspectives.  Perspective  change  is  common  in  a  rapidly  evolving  field  such  as  reengineering. 
Because  of  the  proliferation  of  reengineering  definitions,  the  reader  should  ask  people  what  they  mean  by 
reengineering when more than a general understanding is important.³

The discussion in this paper embodies an approach to classifying reengineering and similar activities, relative to
reengineering definition A. In this approach, a written definition of reengineering is created. (This was done in the
“Reengineering Definitions” section.) Then the context of reengineering and related terms is diagrammed. The
diagram features various views of information and the transitions among the views. (Figure 1 is such a diagram,
and is discussed in “The Context of Reengineering.”) The transitions in the diagram are classified in a table. Finally,
a decision procedure is created for classifying an activity using the table. (Table A2 and the decision procedure
appear in Appendix A.)

Reengineering  spellings  and  related  terms 

Just  as  there  are  no  universally  accepted  reengineering  definitions,  there  are  no  universally  accepted  spellings  for 
reengineering. The most common spellings are reengineering and re-engineering. “Reengineering” is used in this paper.

Synonyms  for  reengineering  abound,  often  with  nuances  reflecting  a  specific  reengineering  purpose: 

•  improvement

•  renewal

•  renovation

•  refurbishing

•  modernization

•  redevelopment engineering

•  reclamation

•  reuse engineering

The  terms  improvement,  renewal,  renovation,  refurbishing,  and  redevelopment  engineering  all  have  similar 
meaning: improving software for evolvability and further use. Modernization includes improvement of software,
but may go beyond by improving software development and maintenance activities surrounding the software.
Reclamation and reuse engineering refer to reengineering to make source code more reusable.

The  context  of  reengineering 

Figure  1  illustrates  a  framework  for  understanding  reengineering  and  related  terms.  The  figure,  an  update 
to a similar one in [Chikofsky90], reflects evolving connotations of terms. Five ideas are shown in Figure
1:


(1)  views  of  software, 

(2)  information  base, 

(3)  decomposition, 

(4)  composition,  and 

(5)  transformation. 


³As in negotiating contracts for reengineering work.

Copyright © 1992 by Robert S. Arnold. Used by permission.

Figure 1. Reengineering and related terms. Reengineering and related technology may be viewed as transforming information
in one software view to information in another software view. The transformation may move information into and out of the
information base.


Understanding these ideas helps distinguish the terms presented in Figure 1.

A  software  view  is  a  representation  of  software  or  a  report  about  software.  A  software  view  may  be  for  human 
viewing  or  not,  but  it  typically  is  a  significant  interim  representation  of  software  that  humans  may  want  to  see.  In 
this  discussion,  the  word  view  refers  to  the  type  of  view  (e.g.,  a  data  flow  diagram).  View  information  means  the 
specific  information  in  a  view  (e.g.,  a  specific  data  flow  diagram  D),  or  the  information  base  of  knowledge 
decomposed  from  information  in  the  view. 

Examples  of  software  views  are  specifications,  source  code,  measurements,  reports  derived  from  static  source  code 
analysis,  and  test  data  used  to  characterize  software  behavior.  Figure  2  has  several  sample  views,  with  possible 
view  information  shown  as  pictures.  When  a  view  is  supported  by  a  tool,  it  nearly  always  comes  with  a  view  editor, 
to support entering, browsing, and changing view information.

As implied by Figure 1, views can be grouped into four classes:

[Figure 2: sample software views (Specifications, Hierarchy charts, Flowcharts), each with its own view editor.]

Copyright © 1992 by Robert S. Arnold. Used by permission.

Figure 2. Multiple software views. Many reengineering tools, especially CASE toolsets, support several different views of
software. No tool as yet supports all the views above, but this gives an idea of what can be found in a view.


Class  1:  Nonprocedural-  and/or  meta-oriented  views. 

Class  2:  Pseudoprocedural-  and/or  architectural-oriented  views. 

Class  3:  Highly  procedural  views,  or  close  derivatives. 

Class  A:  Analysis  views  that  may  accompany  any  other  view. 

Class 1 contains views that are nonprocedural and/or meta.⁴ Software specifications and conceptual
schemas  fall  into  Class  1.  Class  2  contains  views  that  are  pseudo-procedural  or  architectural.  Designs, 
program  design  language  descriptions,  and  software  architecture  diagrams  (such  as  calling  hierarchies  or 
data  flow  diagrams)  fall  into  this  class.  Class  3  contains  highly  procedural  information,  information  closely 
associated  or  derived  from  this  information,  or  direct  information  about  representations.  Source  code, 
program  slices,  data,  data  definitions  in  source  code,  objects  and  relationships  decomposed  from  views,  and 
syntax  trees  are  in  this  class. 

Class A contains analysis views derived from any of the other views. For example, software metrics are derived
by  analyzing  software.  An  analysis  view  can  appear  with  information  in  any  of  the  other  views.  Class  A  can  be 
divided  as  follows: 

Class A1: Analyses pertaining to Class 1 views.

Example:  Fog  index  of  specification  text. 


⁴A “meta” view is a view about something. The intent here is nonprocedural meta views, such as data schemas or decision
tables.

Class A2: Analyses pertaining to Class 2 views.

Example:  Coupling  levels  of  source  modules. 

Class A3: Analyses pertaining to Class 3 views.

Example:  Number  of  modules  in  source  code. 
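
As an illustration of a Class A1 analysis view, the sketch below computes a fog index for a piece of specification text. It assumes the commonly cited Gunning formula, 0.4 × (average words per sentence + percentage of words with three or more syllables). The function names `count_syllables` and `fog_index` are hypothetical, and the syllable counter is a deliberately crude vowel-group heuristic, so the result is only an approximation.

```python
import re


def count_syllables(word):
    """Crude estimate (an assumption): one syllable per group of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fog_index(text):
    """Approximate Gunning fog index of a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    # "Complex" words are those estimated to have three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    average_sentence_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (average_sentence_length + percent_complex)


spec_text = (
    "The system shall record every transaction. "
    "Operators may request a summary at any time."
)
print(round(fog_index(spec_text), 1))
```

A higher value suggests text that is harder to read. An analysis like this accompanies the specification view rather than replacing it.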

In practice, view classes 1, 2, 3, and A are not disjoint. For example, some people may place a program design
language (PDL) description in Class 3, others in Class 2. However, it is assumed for the rest of the discussion that
it is possible for an individual to create disjoint classes for himself or herself, if needed. For the rest of the paper,
it is assumed that the classes are disjoint.

The  information  base  is  the  repository  of  information  about  the  software.  It  is  loaded  in  three  ways: 

(1)  Decomposing  software  into  objects  and  relationships, 

(2)  Incrementally  building  up  objects  and  relationships  through  tools  that  build  on  or  add  to 
knowledge  in  the  information  base,  and 

(3)  Importing  information  from  other  information  bases. 

Decomposition  is  the  process  of  transforming  a  view  into  objects  and  relationships  stored  in  the  information  base. 
For example, compilers commonly decompose programs into abstract syntax tree representations.

Composition  generates  view  information  from  information  in  the  information  base.  The  composer  (the  tool  or 
person that does the composition) assembles view information by finding relevant objects and relationships in the
information base, then adding view formatting as needed to display the view information. For example, the back
end of a compiler commonly generates code by traversing a semantic graph of the program, or some equivalent.
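
The decomposition and composition steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular reengineering tool: the standard-library `ast` module stands in for the parser, the `InformationBase` class and its method names are hypothetical, and the only objects and relationships recorded are function definitions and direct calls by name.

```python
import ast
from collections import defaultdict


class InformationBase:
    """Hypothetical store of objects (functions) and relationships (calls)."""

    def __init__(self):
        self.objects = set()
        self.calls = defaultdict(set)  # caller name -> names it calls

    def decompose(self, source):
        """Decomposition: turn a source-code view into objects and relationships."""
        tree = ast.parse(source)
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef):
                self.objects.add(node.name)
                for inner in ast.walk(node):
                    # Record only direct calls of the form name(...).
                    if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                        self.calls[node.name].add(inner.func.id)

    def compose_call_hierarchy(self):
        """Composition: format stored relationships as a calling-hierarchy view."""
        lines = []
        for caller in sorted(self.objects):
            # Keep only callees that are themselves objects in the base.
            callees = sorted(c for c in self.calls[caller] if c in self.objects)
            shown = ", ".join(callees) if callees else "(no calls)"
            lines.append(f"{caller} -> {shown}")
        return "\n".join(lines)


SAMPLE = '''
def read_input():
    return [3, 1, 2]

def report(values):
    print(sorted(values))

def main():
    report(read_input())
'''

base = InformationBase()
base.decompose(SAMPLE)
view = base.compose_call_hierarchy()
print(view)
```

Here the source code is a Class 3 view and the generated calling hierarchy is a Class 2 view. Both are linked only through what was stored in the information base.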

The notion of transformation is central. In Figure 1, reengineering transforms, in effect, information from one
software view into information in another view, at the same or earlier view class.⁵

Examples of reengineering transformations are transformations from source code (Class 3) to restructured source
code (Class 3), updated designs (Class 2), corrected specifications (Class 1), or computed static measurements
(Class A3). The reengineering transformation usually “improves” the information in the view according to some
criterion.⁶ Software restructuring is reengineering centered on transforming the source code’s structure (syntax and
semantics).

Transformation  also  underlies  reverse  engineering  and  design  recovery.  Reverse  engineering  is  like  reengineering, 
except the origin and target views are different, the target view normally being in an earlier view class. Transforming
source code, for example, into structure charts may be either reengineering or reverse engineering. But transforming
source code to restructured source code is reengineering or restructuring, not reverse engineering. Updating
embedded source code comments is reengineering.

Reverse  engineering  has  other  meanings,  though  these  are  not  used  here.  For  example,  it  can  mean  to  analyze  a 
detailed representation to discover its inherent design. One sometimes hears that people perform reverse engineering
to determine source code from object code. For other meanings of reverse engineering, see [Rekoff85].

Design recovery, a subset of reverse engineering, generates information about software. Often the information is


⁵Even though some reengineering tasks, such as purely manual inspection and improvement of source code, do not involve an online
information base, they can still fit this model. Just substitute information in the person's head for information in the information base.

⁶A frequent criterion is that reengineering should change form but not function. For example, restructuring code to improve
readability is also reengineering.

[Figure 3 legend: software relationships; well-structured/understandable software; poorly structured/not understandable software.]

Copyright © 1992 by Robert S. Arnold. Used by permission.

Figure 3. Dynamics of maintenance and information loss. Maintenance tends to make software harder to change through loss
of information about how to modify it. Fewer and fewer people know less and less about the software. This happens because
people leave who understand the software, and because the software itself becomes harder for new programmers to
understand. Mental connections are lost.


not easily extractable from the software and associated documentation, and requires considerable effort to deduce.
Three examples are generating rationales describing why the software is in its current form, generating
specifications from source code, and generating black box test data sets for software without documentation that
is current. Appendix A gives another interpretation of design recovery in terms of views.

Forward engineering is a transformation, usually from an earlier to a later view class. For example, generating
source  code  from  a  data  flow  diagram  is  usually  a  forward  engineering  activity. 
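
The view-class rules in this section can be sketched as a small classifier. This is only an illustrative simplification and is not the decision procedure of Appendix A. The function name `classify_transformation`, its parameters, and the restriction to Classes 1 to 3 (Class A analysis views are left out) are all assumptions made for the example.

```python
def classify_transformation(origin_class, target_class, same_view):
    """Return the set of terms that could describe a view transformation.

    origin_class, target_class: view class numbers (1, 2, or 3), where
    Class 1 is the "earliest" and Class 3 the "latest" view class.
    same_view: True when origin and target are the same kind of view,
    for example source code to restructured source code.
    """
    for view_class in (origin_class, target_class):
        if view_class not in (1, 2, 3):
            raise ValueError("view classes must be 1, 2, or 3")
    if same_view and origin_class != target_class:
        raise ValueError("the same view cannot sit in two view classes")

    if target_class > origin_class:
        # Earlier to later view class, e.g. data flow diagram to source code.
        return {"forward engineering"}
    if same_view:
        # Same view improved in place, e.g. restructuring source code.
        return {"reengineering"}
    # Different view at the same or an earlier class, e.g. source code
    # to structure charts: either term may apply.
    return {"reengineering", "reverse engineering"}


print(classify_transformation(3, 3, same_view=True))
print(classify_transformation(3, 2, same_view=False))
print(classify_transformation(2, 3, same_view=False))
```

Returning a set rather than a single label reflects the overlap the text describes: a move from source code to structure charts may be called either reengineering or reverse engineering.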


Reengineering  can  be  considered  in  other  ways  besides  transformations  of  view.  The  following  section  discusses 
the  purposes  behind  reengineering. 

Significance  of  Reengineering 

Reengineering  is  important  for  several  reasons: 


(1)  Reengineering  can  help  reduce  an  organization’s  evolution  risk. 

To  extend  software  capabilities,  organizations  can  build  new  software,  evolve  existing  software,  reengineer  and 
evolve  existing  software,  use  application  generators,  or  obtain  software  parts  or  packages.  When  the  latter  two 
options are not available, organizations are faced with building new software or evolving existing software. Simply
manually  evolving  existing  source  code  tends,  in  current  practice,  to  make  the  software  harder  to  change,  or  less 
reliable  when  changed  (see  Figure  3).  Building  software  from  scratch  can  be  expensive  and  uncertain. 
Reengineering software and evolving it sometimes offers less change risk. It can help safeguard an organization’s
software  investment  better  than  building  software  from  scratch  or  simply  evolving  it  through  traditional 
maintenance. 

Copyright © 1992 by Robert S. Arnold. Used by permission.

Figure 4. Reengineering and CASE toolsets. Most CASE toolsets started as forward engineering tools. For added value, CASE
toolmakers have added reengineering and reverse engineering capabilities. Most entry points to these capabilities currently
depend on source code only.


(2)  Reengineering  can  help  an  organization  recoup  its  investment  in  software. 

Companies  have  spent  hundreds  of  thousands  of  dollars  building  software.  The  software  industry  has  spent 
billions.  Rather  than  ignore  existing  software,  companies  can  use  reengineering  to  partially  recoup  their 
software  investment.  Reengineering  helps  organizations  build  on  existing  software. 

(3)  Reengineering  can  make  software  easier  to  change. 

Improving software can pay rich dividends. It speeds productivity of the maintenance programmer by making
code  easier  to  understand  or  work  with.  It  gives  an  organization  more  flexibility  because  its  software  can  be 
modified  more  quickly  to  accommodate  business  changes.  Reengineering  extends  an  organization’s  options. 

(4)  Reengineering  is  big  business. 

The 1990s market for reengineering services and tools has been estimated to be in the billions. As reported
in [Computerworld 91], the estimated expenditures allocated to reengineering services in 1990 were $4.6
billion. For 1995 the estimate is $11.9 billion. The estimated expenditures allocated to reengineering
products, including back-end CASE tools, were $0.8 billion in 1990. For 1995 the estimate is $2.7 billion.
Many  software  systems,  and  system  parts,  need  updating.  Software  contractors  and  service  companies  are 
pursuing  work  in  this  area.  Many  organizations  are  looking  for  reengineering  techniques,  tools,  and  processes 
to  use. 

(5)  Reengineering  capability  extends  CASE  toolsets. 

Reengineering  helps  new  techniques  and  tools  to  be  applied  to  old  software.  This  has  several  benefits.  For 
maintainers, it allows newer and more powerful tools to help them maintain software. Reengineering may be seen
as  a  kind  of  technology  transfer  vehicle,  allowing  old  systems  to  be  brought  into  frameworks  of  new,  more  powerful 
tools.  For  CASE  tool  vendors,  adding  reengineering  tools  to  their  toolsets  opens  new  markets  of  existing  software. 
It  also  gives  important  added  value  to  their  previously  forward-engineering-only  toolsets  (see  Figure  4). 



Copyright © 1992 by Robert S. Arnold. Used by permission.

Figure 5. Automatic reengineering process. Reengineering tools tend to follow a common framework. They parse
information into an information base. Views are then generated from the information base to reflect different information
about software.

(6) Reengineering is a catalyst for automating software maintenance.

Most reengineering tools follow the pattern of Figure 5. They are essentially repositories, with specialized ways
to get information into and out of them. An important part of this is parsed information stored in the information
base. The parsed information is valuable for analysis, automation, and research in software maintenance. The
framework in Figure 5 also provides clear points of evolution within any of its parts.

(7) Reengineering is a catalyst for applying artificial intelligence (AI) techniques to solve software
reengineering  problems. 

As  discussed  earlier,  reengineering  has  a  basis  in  transformations.  Historically,  the  field  of  automated 
transformations is an outgrowth of work in AI, such as the rule-based production systems [Davis76] and
language  processing.  Earlier,  transformations  grew  out  of  formal  logic  and  mathematical  rewrite  systems. 
Reengineering  is  supplying  AI  workers  with  a  fruitful  field  in  which  to  apply  their  work  (e.g.,  [Rich90], 
[Hartman91]). 

For the preceding reasons, reengineering has been significant and likely will remain so during the next 5 - 15 years.
Experience with reengineering will provide ways to improve software development and maintenance. Ideally,
progress in reengineering, development, and maintenance will lessen the need to reengineer in the first place.

Reengineering Technology

There  are  several  themes  underlying  software  reengineering  technology:  improving  software,  understanding 
software,  and  capturing,  preserving,  and  extending  knowledge  about  software.  Table  1  shows  these  themes 
and  their  associated  technology.  The  following  sections  discuss  these  technology  areas.  They  briefly 
describe  each  technical  area  and  its  significance,  and  list  two  or  more  references  for  further  study.  The  reader 
will  find  the  discussion  useful  for  quickly  appreciating  reengineering  technology.  This  discussion,  being 
brief,  is  not  comprehensive  of  all  reengineering  technology  or  all  relevant  work. 

Table 1. Reengineering Technology

Reengineering theme          Associated technology
---------------------------  --------------------------------------------------
Improving software           Restructuring
                             Redocumenting, annotating, updating documentation
                             Reuse engineering
                             Remodularization
                             Data reengineering
                             Business process reengineering
                             Maintainability analysis, portfolio analysis,
                             economic analysis

Understanding software       Browsing
                             Analyzing, measuring
                             Reverse engineering, design recovery

Capturing, preserving,       Decomposition
and extending knowledge      Reverse engineering, design recovery
about software               Object recovery
                             Program understanding
                             Knowledge bases and transformations

Technology for improving software

Software restructuring. Software restructuring is the modification of software to make it easier to understand or
easier to maintain [Arnold89b]. Nowadays the term connotes changing the source code control structure. Restructuring is
significant because it is one of the oldest (e.g., [Bohm66]) and most refined reengineering techniques. Restructuring was one
of the first reengineering tasks to be fully automated. Developing automated restructurers has led the way to other
reengineering tools [Arnold90b]. Informative references on restructuring are [Arnold89b], [Miller87], and [Calliss88].

Redocumenting, annotating, and updating documentation. Redocumenting software is the creation of updated,
correct information about software. Redocumenting code is a transformation from code (and other documents and
programmer knowledge) into new or updated documentation about code. Normally this documentation is textual (e.g.,
embedded comments), but it can be graphical as well. Software improvement by updating documentation (embedded
comments, designs, and specifications) is one of the older reengineering techniques ([Heninger78], [Sneed84]). Redocumentation
is important because maintainers tend to depend on good in-line comments [Glass81] as guideposts to what the code is doing.

Annotating  connotes  adding  documentation  to  source  code  when  there  is  little  useful  documentation  to  begin  with. 
Annotating is particularly important for understanding assembly code. [Landis88] is an informative work on
redocumentation. Success in redocumentation depends heavily on automated tools. [Philips84] tells a horror story
about  trying  to  perform  redocumentation  without  tools. 

Reuse engineering. Reuse engineering is the modification of software to make it more reusable, usually aiming
to find software parts and rebuild them so that they can be put into a reuse library. Several authors describe processes
for finding and reusing parts ([Arnold91b], [Caldiera88]). Prospecting metrics and heuristics are described in
[Arnold90a], [Reynolds90], and [Caldiera91]. Specific techniques for making existing software reusable are described in
[Bailey90]. Reengineering code into more object-oriented forms is related to reuse engineering, and is discussed below.

Legacy systems is a subfield of reuse engineering that decomposes existing systems into objects and relationships
that may be reassembled (reused) in new systems. [Theby91] discusses finding and storing objects and relationships
for legacy systems. The term “legacy system” may also refer to a system of enduring value.

Remodularization. Software remodularization is the changing of the module structure of a system. Often this
depends on cluster analysis of system component characteristics and coupling measures. Recent work in this area
is [Schwanke91] and [Sneed88]. Criteria for modularization are discussed in [Card85].
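
To make the idea of cluster analysis over coupling measures concrete, the following is a minimal sketch in Python. It is not the method of any of the works cited above. It assumes a hypothetical symmetric coupling score between pairs of components (for instance, a count of shared references) and greedily merges the two most strongly coupled groups until the best remaining average coupling falls below a chosen threshold. The names `cluster_by_coupling` and `COUPLING`, the sample scores, and the threshold are all illustrative assumptions.

```python
from itertools import combinations

# Hypothetical pairwise coupling scores between four components.
COUPLING = {
    frozenset({"parse", "lex"}): 9,
    frozenset({"parse", "report"}): 1,
    frozenset({"report", "format"}): 7,
    frozenset({"lex", "format"}): 0,
}


def cluster_by_coupling(coupling, threshold):
    """Greedy agglomerative clustering of components by average coupling.

    coupling:  dict mapping frozenset({a, b}) to a coupling score (assumed metric).
    threshold: merging stops once the best average coupling is below this value.
    Returns a sorted list of sorted component-name lists (candidate modules).
    """
    components = sorted({name for pair in coupling for name in pair})
    clusters = [frozenset([name]) for name in components]

    def avg_coupling(x, y):
        # Average score over all cross-cluster pairs; missing pairs count as 0.
        total = sum(coupling.get(frozenset({a, b}), 0) for a in x for b in y)
        return total / (len(x) * len(y))

    while len(clusters) > 1:
        best = max(combinations(clusters, 2), key=lambda p: avg_coupling(*p))
        if avg_coupling(*best) < threshold:
            break
        # Replace the two chosen clusters with their union.
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return sorted(sorted(c) for c in clusters)


modules = cluster_by_coupling(COUPLING, threshold=3)
```

With the sample scores, `modules` groups `lex` with `parse` and `format` with `report`. In practice the choice of coupling measure and stopping rule dominates the result, which is why the criteria discussed in the cited literature matter more than the clustering loop itself.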

Data reengineering. Data reengineering improves a system’s data. Schemas may be reorganized and updated,
multiple schemas may be consolidated into one schema, data dictionary entries may be made semantically consistent,
and invalid data may be removed. Data reengineering is often a prelude to other tasks, such as migrating data to another
data base management system. Informative references on data reengineering are [Ricketts89] and [Hevner89].

Business process reengineering. With today’s newer, flexible software architectures and information technology
automation possibilities, there is a trend to make software fit the business, rather than business fit the software.
Experience has shown that powerful productivity improvements can sometimes come from rethinking the business
processes automated by software ([Hammer90], [Davenport90]). This rethinking may result in new software
designs that can become the basis for reengineering, migrating, or evolving a software system.

Maintainability analysis, portfolio analysis, economic analysis. Software maintainability analysis is important
for discovering what parts of a system should be reengineered. Typically the majority of maintenance work is
centered on relatively few modules in a system. Maintainability analysis helps to locate the high maintenance
system parts. These parts have the biggest initial impact on maintenance costs.
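
One simple way to locate such high maintenance parts is to rank modules by how often they change. The Python sketch below is only an illustration under stated assumptions: it presumes a hypothetical per-module count of maintenance changes is available (for example, from a change log), and it returns the smallest group of top-ranked modules that accounts for a chosen share of all changes. The function name, the sample counts, and the 80 percent share are assumptions, not figures from the cited work.

```python
# Hypothetical number of maintenance changes recorded per module.
CHANGE_COUNTS = {"billing": 40, "ui": 5, "report": 35, "auth": 12, "util": 8}


def high_maintenance_modules(change_counts, share=0.8):
    """Return the most-changed modules that together cover `share` of all changes."""
    total = sum(change_counts.values())
    ranked = sorted(change_counts.items(), key=lambda kv: kv[1], reverse=True)
    selected, covered = [], 0
    for module, count in ranked:
        if covered >= share * total:
            break  # the modules chosen so far already cover the requested share
        selected.append(module)
        covered += count
    return selected


hot_spots = high_maintenance_modules(CHANGE_COUNTS)
```

Here three of the five sample modules cover the requested share of changes. Change frequency is only one possible indicator; the maintainability metrics discussed in the literature could be substituted for the raw counts.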

[Peercy81] defines a methodology for determining program maintainability. More recently, [Oman92a] and
[Oman92b] describe what maintainability is and metrics for assessing it. To decide when and where to reengineer,
[Husmann90] discusses the use of portfolio analysis, [Sneed91] discusses the use of cost-benefit models, and
[Connell87] describes reengineering criteria.

Technology  for  understanding  software 

Browsing. Browsing of software, such as with a text editor, is perhaps the oldest means for understanding it.
Recently browsing has become more advanced, with the use of hypertext [Conklin87] to make connections between
related parts and multiple view systems [Cleveland89] to provide different views at the click of a mouse (see Figure
2). Cross-reference tools are another important part of browsing.

Analyzing, measuring. Analysis and measurement are also important technologies for understanding program
properties such as complexity. A large literature on metrics has accumulated (such as [Cote88]). Relevant
techniques for reengineering are program slicing [Weiser81], control flow complexity measurement [McCabe76],
coupling measures [Myers75], and many others (for example, [Harrison82], [Rombach89]).
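
As an illustration of one such measurement, control flow complexity in the sense of [McCabe76] is commonly computed as V(G) = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components of the control flow graph. The Python sketch below assumes the graph is already available as an adjacency dictionary; the function name and the sample graph (an if/else followed by a loop) are illustrative and are not taken from the cited work.

```python
# Hypothetical control flow graph: an if/else followed by a loop.
SAMPLE_CFG = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["loop"],
    "else": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}


def cyclomatic_complexity(cfg, components=1):
    """Compute V(G) = E - N + 2P for a control flow graph.

    cfg:        dict mapping each node to the list of its successor nodes.
    components: number of connected components P (1 for a single routine).
    """
    nodes = set(cfg)
    for targets in cfg.values():
        nodes.update(targets)  # include nodes that appear only as successors
    edges = sum(len(targets) for targets in cfg.values())
    return edges - len(nodes) + 2 * components


complexity = cyclomatic_complexity(SAMPLE_CFG)
```

The sample graph has 8 edges and 7 nodes, so the result is 3, which matches the usual rule of thumb of one plus the number of binary decisions.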

Reverse engineering, design recovery. As indicated above, reverse engineering and design recovery generate new
information about software, usually in a different view. This technology has become popular, but determining some
kinds of design information (like design rationales) is still quite risky [Corbi90].

More commonly, reverse engineering generates structure charts or data flow diagrams from source code. These
tools rely heavily on information readily available or analyzable from the code itself. The January 1990 issue of
IEEE Software has a good collection of papers on reverse engineering and design recovery.

Capturing,  preserving,  and  extending  knowledge  about  software 

Decomposition. Program decomposition takes a program and makes objects and relationships out of it. These
objects and relationships are stored in an information base. The objects and relationships facilitate analysis,
measurement, and transformation and extraction of further information. Working on a decomposition rather than
directly on the source code saves the work of having to parse the program and create objects and relationships for
use by a tool. This task, though straightforward for most languages, is time-consuming to solve from scratch. For
most languages it is easier to rely on off-the-shelf decomposers, or decomposer generators such as lex and yacc on Unix.

Decomposition is not confined to reengineering. It is used in integrated programming support environments and
structured editors. Good references on decomposition are [Chen90], [Lyle88], and [Gopal88], and there are many more.
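
The sketch below illustrates decomposition into objects and relationships using an off-the-shelf parser. It is an assumption-laden example, not a tool from the cited references: it takes Python as the subject language, uses the standard `ast` module as the decomposer, treats each function definition as an object, and records a "calls" relationship for each direct call to a plain name. The sets returned stand in for the information base; the names `decompose` and `SAMPLE_SOURCE` are illustrative.

```python
import ast

# Hypothetical program to decompose.
SAMPLE_SOURCE = '''
def read(path):
    return open(path).read()

def main():
    text = read("in.txt")
    print(len(text))
'''


def decompose(source):
    """Decompose source text into (objects, relationships).

    objects:       set of ("function", name) tuples.
    relationships: set of (caller, "calls", callee) tuples for calls to plain names.
    Calls made inside a nested function are also attributed to the enclosing one.
    """
    tree = ast.parse(source)
    objects, relationships = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            objects.add(("function", node.name))
            for inner in ast.walk(node):
                # Method calls such as x.read() have an Attribute as func and are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    relationships.add((node.name, "calls", inner.func.id))
    return objects, relationships


objects, relationships = decompose(SAMPLE_SOURCE)
```

Once the objects and relationships exist, later analyses (for example, a call graph or a coupling measure) can work on these sets without parsing the source again, which is the saving the text describes.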

Object  recovery.  Object  recovery  obtains  objects  from  source  code.  This  allows  one  to  view  previously  nonobject- 
oriented  source  code  in  an  object-oriented  way.  The  object-orientation  (classes,  inheritance,  methods,  abstract  data 
types,  etc.)  may  be  partial  or  complete. 

Migrating source code to object-oriented form has been receiving attention. An object-oriented program structure may offer
more program understanding, migration, and impact reduction possibilities than nonobject-oriented code. There has been
previous experience with object orientation when converting systems from C to C++. [Breuer91], [Dietrich91],
[Jacobson91], [Dunn91], and [Byrne91] describe recent experience and ideas for discovering objects in source code.

Program understanding. Program understanding takes several forms. One is manual or automated techniques for
programmers to gain a better understanding of the software. The other is a body of work that stores information about
programming and uses this information to find instances of programming knowledge in the code. Understanding is evidenced
by the extent to which the software is matched with the tool’s base of programming knowledge. [Robson91] gives a quick
overview of both forms of program understanding. Instances of the latter work are [Hartman91], [Rich90], and [Harandi90].

Knowledge bases and program transformations. Knowledge bases (for example, [Harandi90]) and program transformations (e.g.,
[Burson90]) are foundational to much reengineering technology today. The information base, associated transformation
engine, and programmed transformations drive the power of the reengineering tool. The transformations work on program
graphs and object graphs stored in the knowledge base. Object-based, transformational architectures for reengineering tools
(e.g., [Burson90]) are attractive for building new reengineering tools (for example, see [Kozaczynski89]).

Reengineering  Strategies  and  Risk  Mitigation 

Being able to apply reengineering technology is as important as knowing what it is and what it can do. Much
experience has been accumulated.

Reengineering  process 

The  reengineering  process  takes  many  forms,  depending  on  its  objectives.  Sample  objectives  are  code  cleanup, 
redocumentation,  migration,  capture  of  information  in  an  information  base,  and  reengineering  code  for  reuse. 

Case studies are frequent sources of reengineering processes. For example, [Slovin91] describes the improvement
of the modular structure and maintainability of a system having high maintenance costs. [Britcher90] describes
a reengineering feasibility project in which a Federal Aviation Administration terminal approach control system
was reengineered to operate on more modern hardware and in Pascal (instead of assembly code).

Others have tried to systematize the reengineering process. Such an approach has been described in [Ulrich90-91].
The major process steps here are inventory/analysis, positioning, and transformation. The inventory and analysis phase
establishes a software components base and evaluates reengineering options based on this inventory. Positioning improves
software quality without necessarily affecting existing functionality or architecture. Positioning improves the software to
facilitate change, or analysis to support change. The transformation phase creates a new architecture from the existing one.

Reengineering evaluation

There is concern about the empirical effectiveness of reengineering. The evidence for reengineering falls into anecdotes, case
studies, lessons learned, and experiments. A good discussion of cost-benefit analysis issues for reengineering can be found
in [Sneed91]. Experiments such as [FSMSC87] and [Sneed90] imply that software restructuring and reengineering,
respectively, can be helpful. An early case study ([Sneed84]) showed mixed cost-benefit results, but this is not unusual for
work being done for the first time. Later case studies (i.e., [Slovin91] and [Britcher90]) showed positive reengineering results.

Reengineering risk analysis and mitigation

Reengineering is not something that one simply “does.” It is easy to waste time and dollars on ineffective approaches. As
pointed out in [Ulrich90], one should approach reengineering with a plan. The plan can then be evaluated and risks assessed.

[Arnold91a] cataloged several typical reengineering risk areas, associated risks, and mitigations. Table 2 shows
some of these areas and their associated risks.


Table 2. Reengineering Risks (from [Arnold91a])

Risk Area          Risks
-----------------  ------------------------------------------------------------------------
Process Risks      • Extremely high manual reengineering costs.
                   • Cost benefits not realized in required time frame.
                   • Cannot economically justify the reengineering effort.
                   • Reengineering effort drifts.
                   • Lack of management commitment to ongoing reengineering solution
                     [Ulrich90a].

Personnel Risks    • Programmers inhibiting the start of reengineering.
                   • Programmers performing less effectively to make an unpopular
                     reengineering project look less effective.

Application Risks  • Reengineering with no local application experts available.
                   • Existing business knowledge embedded in source code is lost
                     ([Koka91], [Ulrich90a]).
                   • Reengineered system does not perform adequately [Koka91].

Technology Risks   • Recovered information is not useful or not used [ChikofskySl].
                   • Masses of (expensive) documentation produced.
                   • Reverse engineering to representations that cannot be shared.
                   • Reengineering technology inadequate to accomplish reengineering goals.
                   • Reengineering where there is little reengineering technology support.

Tools Risks        • Dependence on tools that do not perform as advertised.
                   • Not using installed tools.

Strategy Risks     • Premature commitment to a reengineering solution for an entire
                     system.
                   • Failure to have a long-term vision with interim goals [Ulrich90a].
                   • Lack of global view: code, data, process reengineering.
                   • No plan for using reengineering tools [Ulrich90a].

The message here is to respect reengineering and have contingencies for mitigating risk. In many cases mitigation
means trying reengineering on a small scale (e.g., a small subset of programs) and assessing risks before committing
to a wholesale reengineering effort.

Future  Advances 

Several advances in reengineering technology can be expected, and are discussed briefly below. Some of the areas
are already supported, and are being extended with CASE tools. Often CASE tool vendors are leading the way in
(internal)  research  because  of  the  product  infrastructure  that  they  own.^ 

Software  maintainability  measurement 

There is interest in developing suites of metrics for finding hard-to-maintain code. This code could be a prime
candidate for reengineering. Several maintainability metrics and frameworks exist (e.g., [Arnold83],
[Arnold82], [Peercy81]). However, more recently, newer maintainability frameworks have emerged
([Oman92a], [Oman92b]).

Reengineering cost-benefit models

A medium or large reengineering effort often requires cost-benefit justification. There is a need for
organizations to input their maintenance parameters and get a reasonable estimate of the costs, benefits, and
timeframe for payoff. More work is needed in making these estimation tools available ([Arnold89],
[Sneed91], [Sittenauer92]).
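
To show the kind of input and output such an estimation tool might have, here is a deliberately simple break-even sketch in Python. It is not one of the models in the cited works. It assumes only three hypothetical parameters: a one-time reengineering cost and the monthly maintenance cost before and after reengineering. The function name and sample figures are illustrative.

```python
import math


def payoff_months(reengineering_cost, monthly_cost_before, monthly_cost_after):
    """Return the number of whole months until savings repay the investment.

    Returns None when reengineering does not lower the monthly maintenance cost,
    since the effort would then never pay for itself under this simple model.
    """
    monthly_savings = monthly_cost_before - monthly_cost_after
    if monthly_savings <= 0:
        return None
    return math.ceil(reengineering_cost / monthly_savings)


# Hypothetical figures: a 120,000 effort that cuts monthly maintenance from 20,000 to 14,000.
months = payoff_months(120_000, 20_000, 14_000)
```

With these sample figures the payoff time is 20 months. A realistic model would also account for factors this sketch ignores, such as risk, the remaining life of the system, and the cost of money.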

Expert  systems  for  reengineering  tasks 

Much  expertise  now  exists  about  reengineering.  A  logical  next  step  is  to  capture  this  expertise  in  an  expert  system 
rules  base.  The  rules  base  can  then  be  embedded  into  an  existing  CASE  tool  to  enhance  the  value  of  the  tool  for 
users.  Example  areas  where  expert  systems  can  be  useful  are  prospecting  for  reusable  parts  in  source  code 
[Knight92],  finding  objects  in  source  code,  and  remodularizing  systems. 

Reverse  engineering  into  models 

Model-based maintenance deals mainly with nonprocedural diagrams that together model an application. The
source code is generated, usually with a CASE tool, from the diagrams. Maintenance of source code is done through
the diagrams, not directly on the source code. This maintenance paradigm holds great promise for reducing
maintenance costs and facilitating software evolution.

CASE tools already exist for model-based maintenance. Methodologies have been developed and are being refined
for migrating existing systems into model-based systems. There is room for extending and automating parts of this
work.

Software process instrumentation for maintenance

The  software  process  needs  to  be  instrumented  to  capture  software  change.  This  will  allow  change  histories 
for  software  to  be  animated  and  “played  back”  to  maintainers.  This  goes  beyond  configuration  management. 
The  change  histories  must  be  captured  in  ways  that  semantically  make  sense  to  maintainers  that  are  browsing 
them. 


Trying to create a CASE tool infrastructure from scratch, without tool-building tools, can be time-consuming and expensive.
However, powerful tool-building technology and off-the-shelf tools are now available that can speed the creation of a tool
infrastructure.

Summary

The  major  points  made  here  are 

(1) Software reengineering is any activity designed

•  to  improve  one’s  understanding  of  software,  or 

• to improve the software itself, usually for increased maintainability, reusability, or evolvability.

(2) There is no universally accepted definition of reengineering. Several definitions exist that reflect different
perspectives on what reengineering is. The approach embodied in the section titled, “The Context of Reengineering,”
and Appendix A can be used to more precisely characterize definitions of reengineering, reverse engineering, and
other terms.

(3) Transformation is central to reengineering and related terms. Key elements of transformations are views of
software, an information base, decomposition of software information into objects and relationships in an
information base, and composition of views from information in the information base.

(4)  There  are  several  reasons  why  reengineering  is  significant: 

•  Reengineering  can  help  reduce  an  organization’s  evolution  risk. 

•  Reengineering  can  help  an  organization  recoup  its  investment  in  software. 

• Reengineering can make software easier to change.

•  Reengineering  is  big  business. 

• Reengineering capability amplifies CASE toolsets.

•  Reengineering  is  a  catalyst  for  automating  software  maintenance. 

• Reengineering is a catalyst for applying artificial intelligence (AI) techniques to solve software
reengineering problems.

(5) Reengineering technology has three basic themes:

• Improving software,

•  Understanding  software,  and 

• Capturing, preserving, and extending knowledge about software.

(6)  Reengineering  risks  exist,  but  they  can  be  planned  for  and  mitigated. 

Appendix A

A  decision  procedure  for  classifying  a  software  transformation 

To decide if a given transformation (or procedure), transforming information in view D into information in view
C, is reengineering, reverse engineering, design recovery, or forward engineering, do the following:

1. Put view D and view C into one of the view classes (1, 2, 3, A1, A2, A3). (See Figure 1 and the discussion
of it in the section titled, “The Context of Reengineering”.) If D and C cannot be classified, then this
decision  procedure  cannot  be  used.^ 

2.  Classify  the  transformation  according  to  the  following  table.  A  label  of  “open”  in  the  table 
means  the  transformation  is  not  classified.  (A  transformation  not  listed  in  the  table  is  also  “open.”)  It 
may  or  may  not  be  an  instance  of  the  other  terms.  More  information  is  needed. 

Example 


Each  example  is  assumed  to  meet  the  necessary  conditions  of  Table  A2. 


Table A1. Example Transformation Classifications

Transformation                    View D, Class            View C, Class                  Classification from Table A2
--------------------------------  -----------------------  -----------------------------  ---------------------------------------------------
Source code --> Decision table    Source code, 3           Decision table, 1              Reverse engineering, design recovery, reengineering

Data flow diagram --> Textual     Data flow diagram, 2     Textual specifications, 1      Reverse engineering, design recovery, reengineering
  specifications

Source code --> Lines of code     Source code, 3           Lines of code measurement, A3  Open
  measurement

Source code --> Pretty printed    Source code, 3           Pretty printed source code     Restructuring, reengineering
  source code                                              (same program and language), 3

Source code inline comments -->   Source code inline       Revised source code inline     Reengineering
  Revised source code inline      comments, 3              comments, 3
  comments

Raw data in database --> Logical  Raw data in database, 3  Logical data schema, 1         Reverse engineering, design recovery, reengineering
  data schema

'It is also assumed that the classification has intuitive validity. For example, source code is not placed in view class 1,
nonprocedural specifications.

Table A2. Classification of Transformations in Figure 1.

Classification                View class of D  View class of C      Other conditions*
----------------------------  ---------------  -------------------  ------------------------------
Forward engineering           1                2, A2, 3, or A3
Open                          1 or A1          A1
Reengineering                 1                1
Forward engineering           A1               1, 2, A2, 3, or A3
Reverse engineering, design   2                1 or A1
  recovery, reengineering
Reengineering                 2                2
Open                          2 or A2          A2
Forward engineering           2                3 or A3
Reverse engineering           A2               1
Open                          A2               A1
Forward engineering           A2               2, 3, or A3
Reverse engineering, design   3                1, 2, A1, or A2
  recovery, reengineering
Reengineering                 3                3
Restructuring                 3                3                    V, and both D and C are source
                                                                    code in the same language for
                                                                    the same program.
Open                          3 or A3          A3
Forward engineering           A3               3
Reverse engineering, design   A3               2 or A2
  recovery, reengineering

*Note: V means that the transformation meets the following conditions:

(1) Improves one's understanding of software, or

(2) Prepares or improves the software itself, usually for increased maintainability, reusability,
or evolvability.

The “or” is an inclusive “or.”
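
The decision procedure of this appendix can be sketched as a table lookup. The Python below is a minimal illustration, not part of the original procedure. It encodes one reading of the rows of Table A2, assumes that condition V holds for every transformation passed in (as Table A1 assumes for its examples), and models the one extra condition on restructuring with a hypothetical flag, `same_program_and_language`. The names `TABLE_A2` and `classify` are illustrative.

```python
VIEW_CLASSES = {"1", "2", "3", "A1", "A2", "A3"}
RE = ("reverse engineering", "design recovery", "reengineering")
FE = ("forward engineering",)
OPEN = ()  # an "open" row contributes no classification

# Each row: (labels, view classes of D, view classes of C, needs same program/language)
TABLE_A2 = [
    (FE, {"1"}, {"2", "A2", "3", "A3"}, False),
    (OPEN, {"1", "A1"}, {"A1"}, False),
    (("reengineering",), {"1"}, {"1"}, False),
    (FE, {"A1"}, {"1", "2", "A2", "3", "A3"}, False),
    (RE, {"2"}, {"1", "A1"}, False),
    (("reengineering",), {"2"}, {"2"}, False),
    (OPEN, {"2", "A2"}, {"A2"}, False),
    (FE, {"2"}, {"3", "A3"}, False),
    (("reverse engineering",), {"A2"}, {"1"}, False),
    (OPEN, {"A2"}, {"A1"}, False),
    (FE, {"A2"}, {"2", "3", "A3"}, False),
    (RE, {"3"}, {"1", "2", "A1", "A2"}, False),
    (("reengineering",), {"3"}, {"3"}, False),
    (("restructuring",), {"3"}, {"3"}, True),
    (OPEN, {"3", "A3"}, {"A3"}, False),
    (FE, {"A3"}, {"3"}, False),
    (RE, {"A3"}, {"2", "A2"}, False),
]


def classify(d_class, c_class, same_program_and_language=False):
    """Return the set of terms that apply to a transformation from view D to view C."""
    if d_class not in VIEW_CLASSES or c_class not in VIEW_CLASSES:
        # Step 1: the procedure cannot be used if the views cannot be classified.
        raise ValueError("views must first be placed in a view class")
    labels = set()
    for row_labels, d_set, c_set, needs_same_source in TABLE_A2:
        if d_class in d_set and c_class in c_set:
            if needs_same_source and not same_program_and_language:
                continue
            labels.update(row_labels)
    # Step 2: anything unlisted, or listed only as open, stays "open".
    return labels or {"open"}


pretty_printing = classify("3", "3", same_program_and_language=True)
```

Applied to the examples of Table A1, this lookup gives the same classifications: source code to a decision table (3 to 1) yields reverse engineering, design recovery, and reengineering; source code to a lines-of-code measurement (3 to A3) is open; and pretty printing yields restructuring and reengineering.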


